FIT5215: Deep Learning (2023)¶


CE/Lecturer: Dr Trung Le | trunglm@monash.edu
Head Tutor: Mr Tuan Nguyen | tuan.Ng@monash.edu

Department of Data Science and AI, Faculty of Information Technology, Monash University, Australia


Student Information¶


Surname: [Yee]
Firstname: [Darren Jer Shien]
Student ID: [31237223]
Email: [dyee0005@student.monash.edu]
Your tutorial time: [10AM Monday]


Deep Neural Networks¶

Due: 11:59pm Sunday, 10 September 2023 (Sunday)¶

Important note: This is an individual assignment. It contributes 20% to your final mark. Read the assignment instruction carefully.¶

Instruction¶

This notebook has been prepared for you to complete Assignment 1. The theme of this assignment is about practical knowledge and skills in deep neural networks, including feedforward and convolutional neural networks. Some sections have been partially completed to help you get started. The total marks for this notebook is 100.

  • Before getting started, you should read the entire notebook carefully once to understand what you need to do.

  • For each cell marked with #YOU ARE REQUIRED TO INSERT YOUR CODES IN THIS CELL, there will be places where you must supply your own codes when instructed.

This assignment contains three parts:

  • Part 1: Questions on theory and knowledge on deep learning [35 points], 35%
  • Part 2: Coding assessment on TensorFlow for Deep Neural Networks (DNN) [25 points], 25%
  • Part 3: Coding assessment on TensorFlow for Convolution Neural Networks (CNN) [40 points], 40%

Hint: This assignment was essentially designed based on the lectures and tutorials sessions covered from Week 1 to Week 5. You are strongly recommended to go through these contents thoroughly which might help you to complete this assignment.

What to submit¶

This assignment is to be completed individually and submitted to Moodle unit site. By the due date, you are required to submit one single zip file, named xxx_assignment01_solution.zip where xxx is your student ID, to the corresponding Assignment (Dropbox) in Moodle.

For example, if your student ID is 12356, then gather all of your assignment solution to folder, create a zip file named 123456_assignment01_solution.zip and submit this file.

Within this zip folder, you must submit the following files:

  1. Assignment01_solution.ipynb: this is your Python notebook solution source file.
  2. Assignment01_output.html: this is the output of your Python notebook solution exported in html format.
  3. Any extra files or folder needed to complete your assignment (e.g., images used in your answers).

Since the notebook is quite big to load and work together, one recommended option is to split solution into three parts and work on them seperately. In that case, replace Assignment01_solution.ipynb by three notebooks: Assignment01_Part1_solution.ipynb, Assignment01_Part2_solution.ipynb and Assignment01_Part3_solution.ipynb

You can run your codes on Google Colab. In this case, you have to make a copy of your Google colab notebook including the traces and progresses of model training before submitting.

You also need to store your trained models to folder *./models* with recognizable file names (e.g., Part3_Sec3_2_model.h5).

Part 1: Theory and Knowledge Questions¶

[Total marks for this part: 35 points]

The first part of this assignment is to demonstrate your knowledge in deep learning that you have acquired from the lectures and tutorials materials. Most of the contents in this assignment are drawn from the lectures and tutorials from weeks 1 to 3. Going through these materials before attempting this part is highly recommended.

**Question 1.1** Activation function plays an important role in modern Deep NNs. For each of the activation function below, state its output range, find its derivative (show your steps), and plot the activation fuction and its derivative¶

**(a)** Exponential linear unit (ELU): $\text{ELU}(x)=\begin{cases} 0.1\left(\exp(x)-1\right) & \text{if}\,x\leq0\\ x & \text{if}\,x>0 \end{cases}$

[2 points]

**(b)** Gaussian Error Linear Unit (GELU): $\text{GELU}(x)=x\Phi(x)$ where $\Phi(x)$ is the probability cummulative function of the standard Gaussian distribution or $\Phi(x) = \mathbb{P}\left(X\leq x\right)$ where $X\sim\mathcal{N}\left(0,1\right)$. In addition, the GELU activation fuction (the link for the main paper) has been currently widely used in the state-of-the-art Vision for Transformers (e.g., here is the link for the main ViT paper).

[2 points]

Your ansewers here

**Question 1.1**
Derrivative of Exponential linear unit (ELU): $\text{ELU}(x)=\begin{cases} 0.1\left(\exp(x)\right) & \text{if}\,x\leq0\\ 1 & \text{if}\,x>0 \end{cases}$

Steps:

x <= 0
= 1/10 (d/dx[exp(x)] + d/dx[-1])
= (exp(x) +0) /10
= 0.1 (exp(x))


x > 0 
= d/dx(x)
= 1


output range = [-0.1,infinity]

In [1]:
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

# Elu code
def elu (x, alpha):
    return np.where(x > 0,x, alpha*(np.exp(x)-1))

def elu_derivative(x, alpha):
    return np.where(x > 0, 1, alpha*np.exp(x))

# Plotting code
x = np.linspace(-5, 5, 400)
alpha = 0.1
elu_values = elu(x, alpha)
elu_derivative_values = elu_derivative(x, alpha)

plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 2)
plt.plot(x, elu_values, label="ELU")
plt.plot(x, elu_derivative_values, label="ELU's Derivative")
plt.title("ELU and Derivative Values")
plt.xlabel('x')
plt.ylabel("f'(x)")
plt.legend()

plt.tight_layout()
plt.show()

Formula of GELU =

GELU(x)=xP(X≤x)=xΦ(x)


which can be approximated to 


$0.5x(1+tanh[√(2/π)−(x+0.044715x^3)])$


Derivative of gelu

= d/dx ($0.5x(1+tanh[√(2/π)−(x+0.044715x^3)])$)


output range = [-infinity,inifinty]

In [2]:
#Gelu code
def gelu(x):
    cdf = 0.5 * (1.0 + tf.math.erf(x / tf.sqrt(2.0)))
    return x * cdf

def gelu_derivative(x):
    with tf.GradientTape() as tape:
        tape.watch(x)
        y = gelu(x)
    return tape.gradient(y, x)

#Plotting code
x = np.linspace(-5, 5, 400)
x_tf = tf.constant(x, dtype=tf.float32)
gelu_values = gelu(x_tf)
gelu_derivative_values = gelu_derivative(x_tf)

plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 2)
plt.plot(x, gelu_values, label="GELU")
plt.plot(x, gelu_derivative_values, label="GELU's Derivative")
plt.title("GELU and Derivative Values")
plt.xlabel('x')
plt.ylabel("f'(x)")
plt.legend()

plt.tight_layout()
plt.show()

**Numpy is possibly being used in the following questions. You need to import numpy here.**

In [3]:
import numpy as np

**Question 1.2** Assume that we feed a data point $x$ with a ground-truth label $y=3$ to the feed-forward neural network with the ReLU activation function as shown in the following figure¶

**(a)** What is the numerical value of the latent presentation $h^1(x)$?

[1 point]

**(b)** What is the numerical value of the latent presentation $h^2(x)$?

[1 point]

**(c)** What is the numerical value of the logit $h^3(x)$?

[1 point]

**(d)** What is the corresonding prediction probabilities $p(x)$?

[1 point]

**(e)** What is the predicted label $\hat{y}$? Is it a correct and an incorect prediction? Remind that $y=3$.

[3 point]

**(f)** What is the cross-entropy loss caused by the feed-forward neural network at $(x,y)$? Remind that $y=3$.

[1 point]

**(g)** Why the cross-entropy loss caused by the feed-forward neural network at $(x,y)$ (i.e., $\text{CE}(1_y, p(x))$) is always non-negative? When does this $\text{CE}(1_y, p(x))$ loss get the value $0$? Note that you need to answer this question for a general pair $(x,y)$ and a general feed-forward neural network with for example $M=4$ classes?

[3 point]

You need to show both formulas and numerical results for earning full mark. Although it is optional, it is great if you show your numpy code for your computation.

In [4]:
# Answers for Questions
x = np.matrix([[1],[-1],[-2]])

w1 = np.matrix([[1,-1,2],[-1,0.5,1],[-2,1,2],[0,0,1]])
w2 = np.matrix([[-1,1,0,1],[1,1,0,-2],[0.5,-1,2,0]])
w3 = np.matrix ([[1,-2,0],[0,2,0],[1,-1,1],[0.5,1,2]])

b1 = np.matrix ([[1],[0],[1],[0]])
b2 = np.matrix ([[0],[0.5],[1]])
b3 = np.matrix ([[-1],[1],[-1],[1]])

# h1
a1 = np.maximum(np.dot(w1, x) + b1, 0)
#h2
a2 = np.maximum(np.dot(w2, a1) + b2, 0)
#h3 logit
a3 = np.dot(w3, a2) + b3

print ("a)\n")
print (a1,"\n")
print ("b)\n")
print (a2,"\n")
print ("c)\n")
print (a3,"\n")

def softmax(x):
    """Compute softmax values for each sets of scores in x."""
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum(axis=0)

#prediction probability
prediction_probability = softmax(a3)
                                 
print ("d)\n")
print (prediction_probability,"\n")
print ("e)\n")
#prediction label
predicted_label = np.argmax(prediction_probability)
print ("Predicted label = ",predicted_label + 1,"\n")
print ("f)\n")
one_hot = np.zeros(prediction_probability.shape,dtype=int)
one_hot[predicted_label] = 1
target_index = 3
#ce loss
ce_loss = -np.log(prediction_probability[target_index-1])
print ("Cross entropy loss =",ce_loss[0,0],"\n")
print ("g)\n")
a)

[[0.]
 [0.]
 [0.]
 [0.]] 

b)

[[0. ]
 [0.5]
 [1. ]] 

c)

[[-2. ]
 [ 2. ]
 [-0.5]
 [ 3.5]] 

d)

[[0.00328114]
 [0.17914438]
 [0.01470507]
 [0.80286941]] 

e)

Predicted label =  4 

f)

Cross entropy loss = 4.219563205900562 

g)

For a general (x,y) pair and a general Neural Network with M = 4 classes, we can use it to prove why CE loss is always non-nagative. The calculation of CE loss involves us using the one-hot vector (all values are 0 except for the true class) and taking the negative log value between it and the original predicted probabilities using Softmax.

For example: the one-hot vector could be [0,1,0,0] and the predicted probabilities could be [0.2,0.4,0.2,0.2].
We calcuate CE loss by taking the summation of the negative log of each value in the one-hot vector and their corresponding probabilities

(-np.sum(one_hot np.log(predicted))*

Because of that, it means that we can ignore all the other classes that are not true since their 0 value in the one-hot vector would mean that their values will not be included in the loss anyways. Log of any value between 0 and 1 (inclusive) is always negative and since our one_hot value for our true class is always one, we can deduce that the values (one_hot * np.log(predicted)) will always be negative. Then, by taking the negative sign outside into account, we can prove that the value will always be non-negative due to double negation.

For Question 1.3, you have two options: (i) do forward, backward propagation, and SGD update for one data example (15 points) and (ii) do forward, backward propagation, and SGD update for a batch of data examples (20 points). You can choose either (i) or (ii) to proceed.

**Option 1**¶

[Total marks for this option: 15 points]

**Question 1.3** Assume that we are constructing a multilayered feed-forward neural network for a classification problem with three classes where the model parameters will be generated randomly using your student ID. The architecture of this network is ($3 (Input)\rightarrow 5(ELU) \rightarrow 3(Output)$) as shown in the following figure. Note that the ELU has the same formula as the one in Q1.1.¶

We feed a data example $x$ with the label $y$ as shown in the figure. Answer the following questions.

You need to show both formulas, numerical results, and your numpy code for your computation for earning full marks.

In [5]:
#Code to generate random matrices and biases for W1, b1, W2, b2
import numpy as np
student_id = 31237223           #insert your student id here for example 1234    
np.random.seed(student_id)
W1 = np.random.rand(5,3)
b1 = np.random.rand(5,1)
W2 = np.random.rand(3,5)
b2 = np.random.rand(3,1)

Forward propagation

**(a)** What is the value of $\bar{h}^{1}(x)$?

[1 point]

Show your fomular

In [ ]:
 

**(b)** What is the value of $h^{1}(x)$?

[1 point]
In [ ]:
 

Show your fomular

In [ ]:
 

**(c)** What is the predicted value $\hat{y}$?

[1 point]

Show your fomular

In [ ]:
 

**(d)** Suppose that we use the cross-entropy (CE) loss. What is the value of the CE loss $l$?

[1 point]

Show your fomular

In [ ]:

Backward propagation

**(e)** What are the derivatives $\frac{\partial l}{\partial h^{2}},\frac{\partial l}{\partial W^{2}}$, and $\frac{\partial l}{\partial b^{2}}$?

[4 points]

Show your fomular

In [6]:
#Show your code

**(f)** What are the derivatives $\frac{\partial l}{\partial h^{1}}, \frac{\partial l}{\partial \bar{h}^{1}},\frac{\partial l}{\partial W^{1}}$, and $\frac{\partial l}{\partial b^{1}}$?

[4 points]

Show your fomular

In [7]:
#Show your code

SGD update

**(g)** Assume that we use SGD with learning rate $\eta=0.01$ to update the model parameters. What are the values of $W^2, b^2$ and $W^1, b^1$ after updating?

[3 points]

Show your fomular

In [8]:
#Show your code

**Option 2**¶

[Total marks for this option: 20 points]

**Question 1.3** Assume that we are constructing a multilayered feed-forward neural network for a classification problem with three classes where the model parameters will be generated randomly using your student ID. The architecture of this network is ($3 (Input)\rightarrow 5(ELU) \rightarrow 3(Output)$) as shown in the following figure. Note that the ELU has the same formula as the one in Q1.1.¶

We feed a batch $X$ with the labels $Y$ as shown in the figure. Note that $x^{T}$ represents the transpose vector of the vector $x$. Answer the following questions.

You need to show both formulas, numerical results, and your numpy code for your computation for earning full marks.

In [203]:
#Code to generate random matrices and biases for W1, b1, W2, b2
import numpy as np
student_id = 31237223 #insert your student id here for example 1234    
np.random.seed(student_id)
W1 = np.random.rand(5,3)
b1 = np.random.rand(5,1)
W2 = np.random.rand(3,5)
b2 = np.random.rand(3,1)

Forward propagation

**(a)** What is the value of $\bar{h}^{1}(x)$?

[1 point]

Show your formula

In [204]:
# Show your code
def elu(x, alpha=0.1):
    """ELU activation function."""
    y = np.where(x > 0, x, alpha * (np.exp(x) - 1))
    return y

# Given input examples
x1 = np.matrix([ 1, -1,  1]).T
x2 = np.matrix([-1,  2, -1]).T
x3 = np.matrix([-1.5, 1,  0]).T
x4 = np.matrix([-1,  2, -1]).T
x5 = np.matrix([ 0,  2.5, 1.5]).T

# Stack the input examples to create a mini-batch matrix
mini_batch_matrix = np.hstack((x1, x2, x3, x4, x5))

qA = np.dot(W1,mini_batch_matrix) + b1
print ("answer:\n",qA)
answer:
 [[ 1.01009806  0.30364112 -0.17987178  0.30364112  2.93026733]
 [ 0.45984992  1.08805968  0.05061028  1.08805968  2.59282736]
 [ 1.11497155  1.27251212  0.33600097  1.27251212  3.0365595 ]
 [ 1.27415222  1.2745985   1.22620037  1.2745985   3.40831993]
 [ 0.64008934  0.37189129 -0.47838965  0.37189129  1.50595585]]

**(b)** What is the value of $h^{1}(x)$?

[1 point]

Show your formula

In [205]:
#Show your code
qB = elu(qA)
print ("answer:\n",qB)
answer:
 [[ 1.01009806  0.30364112 -0.01646227  0.30364112  2.93026733]
 [ 0.45984992  1.08805968  0.05061028  1.08805968  2.59282736]
 [ 1.11497155  1.27251212  0.33600097  1.27251212  3.0365595 ]
 [ 1.27415222  1.2745985   1.22620037  1.2745985   3.40831993]
 [ 0.64008934  0.37189129 -0.03802193  0.37189129  1.50595585]]

**(c)** What is the predicted value $\hat{y}$?

[1 point]

Show your formula

In [206]:
#Show your code
# correct solution:
qC = np.dot(W2,qB) + b2

def softmax(x):
    """Compute softmax values for each sets of scores in x."""
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum(axis=0) 

# Get softmax values
softmax_qC = softmax(qC)
prediction_qC = np.argmax(softmax(qC),axis=0)

# init one hot array
one_hot = np.zeros(softmax_qC.shape,dtype=int)
prediction_qC = np.expand_dims(prediction_qC, 0)

# create new one hot array based on prediction
np.put_along_axis(one_hot,prediction_qC,1,axis=0)

print ("softmax:\n",softmax_qC)
print ("one hot:\n",one_hot)
print ("prediction value\n",prediction_qC)
softmax:
 [[0.55341198 0.43520274 0.29568302 0.43520274 0.91412831]
 [0.23225884 0.20552616 0.25346639 0.20552616 0.03970293]
 [0.21432918 0.3592711  0.45085058 0.3592711  0.04616876]]
one hot:
 [[1 1 0 1 1]
 [0 0 0 0 0]
 [0 0 1 0 0]]
prediction value
 [[0 0 2 0 0]]

**(d)** Suppose that we use the cross-entropy (CE) loss. What is the value of the CE loss $l$?

[1 point]

Show your formula

In [207]:
#Show your code

# original target array = [2,1,3,1,2], scaled to fit array index
one_hot_target = np.zeros(softmax_qC.shape,dtype=int)
target = [1,0,2,0,1]
target = np.expand_dims(target, 0)
np.put_along_axis(one_hot_target,target,1,axis=0)
ce = - np.sum(one_hot_target * np.log(softmax_qC)) / 5
print ("ce loss:\n",ce)
ce loss:
 1.4293477795659966

Backward propagation

**(e)** What are the derivatives $\frac{\partial l}{\partial h^{2}},\frac{\partial l}{\partial W^{2}}$, and $\frac{\partial l}{\partial b^{2}}$?

[6 points]

Show your formula

Part 1:
$\frac{\partial l}{\partial h^{2}}$ = $g2$ = $p^T - 1y ∈ ℝ^{1×n^2}$

Part 2:
$\frac{\partial l}{\partial W^{2}}$ = $\frac{\partial l}{\partial h^{2}} . \frac{\partial h^2}{\partial W^{2}}$
$\frac{\partial l}{\partial h^{2}}$ = $(g^2)^T$
$\frac{\partial h^2}{\partial W^{2}}$ = $(h^1)^T$
$\frac{\partial l}{\partial W^{2}}$ = $(g^2)^T . (h^1)^T ∈ ℝ^{n^2×n^1}$

Part 3:
$\frac{\partial l}{\partial b^{2}}$ = $\frac{\partial l}{\partial h^{2}} . \frac{\partial h^2}{\partial b^{2}}$
$\frac{\partial l}{\partial h^{2}}$ = $(g^2)^T$
$\frac{\partial h^2}{\partial b^{2}}$ = 1
$\frac{\partial l}{\partial b^{2}}$ = $(g^2)^T ∈ ℝ^{n^2×1}$

In [208]:
#Part 1:
p1 = softmax_qC - one_hot_target
print ("Part 1:\n",p1)
p2 = np.dot(p1,qB.T)
print ("\nPart 2:\n",p2)
p3 = p1.T
print ("\nPart 3:\n",p3)
matr = np.sum (p1.T,axis = 0)
print (matr.reshape(1, -1).T)
Part 1:
 [[ 0.55341198 -0.56479726  0.29568302 -0.56479726  0.91412831]
 [-0.76774116  0.20552616  0.25346639  0.20552616 -0.96029707]
 [ 0.21432918  0.3592711  -0.54914942  0.3592711   0.04616876]]

Part 2:
 [[ 2.88978173  1.41056169  2.05477067  2.74355998  1.29954118]
 [-3.46878122 -2.38285275 -3.16377471 -3.41649147 -1.79435842]
 [ 0.57899949  0.97229106  1.10900404  0.6729315   0.49481725]]

Part 3:
 [[ 0.55341198 -0.76774116  0.21432918]
 [-0.56479726  0.20552616  0.3592711 ]
 [ 0.29568302  0.25346639 -0.54914942]
 [-0.56479726  0.20552616  0.3592711 ]
 [ 0.91412831 -0.96029707  0.04616876]]
[[ 0.63362879]
 [-1.06351951]
 [ 0.42989072]]

**(f)** What are the derivatives $\frac{\partial l}{\partial h^{1}}, \frac{\partial l}{\partial \bar{h}^{1}},\frac{\partial l}{\partial W^{1}}$, and $\frac{\partial l}{\partial b^{1}}$?

[6 points]

Show your formula

Part 1:
$\frac{\partial l}{\partial h^{1}}$ = $g1$ = $g2 . W2 ∈ ℝ^{1×n^2}$

Part 2:
$\frac{\partial l}{\partial \bar{h}^{1}}$ = $\frac{\partial l}{\partial {h}^{1}} . \frac{\partial {h}^{1}}{\partial \bar{h}^{1}} $
$\frac{\partial l}{\partial {h}^{1}}$ = $g^1$
$\frac{\partial {h}^{1}}{\partial \bar{h}^{1}} $= $ELU'$($\bar{h}^{1}$)
$\frac{\partial l}{\partial \bar{h}^{1}}$ = $g^1ELU'$($\bar{h}^{1}$)

Part 3:
$\frac{\partial l}{\partial W^{1}}$ = $\frac{\partial l}{\partial \bar{h}^{1}} . \frac{\partial \bar{h}^{1}}{\partial {W}^{1}} $
$\frac{\partial l}{\partial \bar{h}^{1}}$ = $(\bar g^1)$
$\frac{\partial \bar{h}^{1}}{\partial {W}^{1}} $ = $(h^0)^T$
$\frac{\partial l}{\partial W^{1}}$ = $(\bar g^1)(h^0)^T$

Part 4:
$\frac{\partial l}{\partial b^{1}}$ = $\frac{\partial l}{\partial \bar{h}^{1}} . \frac{\partial \bar{h}^{1}}{\partial {b}^{1}} $
$\frac{\partial l}{\partial \bar{h}^{1}}$ = $(\bar g^1)$
$\frac{\partial \bar{h}^{1}}{\partial {b}^{1}}$ = 1
$\frac{\partial l}{\partial b^{1}}$ = $(\bar g^1)^T$

In [209]:
#Part 1

p1_f = np.dot (p1.T,W2)
print (p1_f)

def elu_derivative(x, alpha):
    return np.where(x > 0, 1, alpha*np.exp(x))

p2_f = np.dot (p1_f,elu_derivative(qA,0.1))

print (p2_f)

p3_f = np.dot (p2_f,mini_batch_matrix.T)

print (p3_f)

p4_f = p2_f.T

print (p4_f)
[[-0.19335577  0.0518627   0.13949576  0.42682301  0.44642821]
 [-0.23272077  0.05467927 -0.14588584 -0.09308672 -0.44906295]
 [ 0.39047495 -0.09584548  0.07857322 -0.16522643  0.23100242]
 [-0.23272077  0.05467927 -0.14588584 -0.09308672 -0.44906295]
 [-0.09032912  0.02835206  0.23229467  0.52259559  0.7339236 ]]
[[ 0.87125391  0.87125391  0.62969772  0.87125391  0.87125391]
 [-0.866077   -0.866077   -0.2315663  -0.866077   -0.866077  ]
 [ 0.43897868  0.43897868 -0.13556222  0.43897868  0.43897868]
 [-0.866077   -0.866077   -0.2315663  -0.866077   -0.866077  ]
 [ 1.4268368   1.4268368   0.82118359  1.4268368   1.4268368 ]]
[[-1.8158005   5.42159424  0.43562696]
 [ 1.21342645 -4.9949898  -0.4330385 ]
 [-0.23563535  2.27882052  0.21948934]
 [ 1.21342645 -4.9949898  -0.4330385 ]
 [-2.6586122   8.66878602  0.7134184 ]]
[[ 0.87125391 -0.866077    0.43897868 -0.866077    1.4268368 ]
 [ 0.87125391 -0.866077    0.43897868 -0.866077    1.4268368 ]
 [ 0.62969772 -0.2315663  -0.13556222 -0.2315663   0.82118359]
 [ 0.87125391 -0.866077    0.43897868 -0.866077    1.4268368 ]
 [ 0.87125391 -0.866077    0.43897868 -0.866077    1.4268368 ]]

SGD update

**(g)** Assume that we use SGD with learning rate $\eta=0.01$ to update the model parameters. What are the values of $W^2, b^2$ and $W^1, b^1$ after updating?

[4 points]

Show your formula

In [210]:
#Show your code
learning_rate = 0.01

# Update W2 and b2 using SGD
W2 -= learning_rate * p2
b2 -= learning_rate * matr.reshape(1, -1).T

print ("W2:\n",W2,"\n","b2:\n",b2)

# Update W1 and b1 using SGD
W1 -= learning_rate * p3_f
# b1 -= learning_rate * np.sum(p4_f)

print ("W1:\n",W1,"\n","b1:\n",b1)
W2:
 [[0.729301   0.50671945 0.87949781 0.79949605 0.97970768]
 [0.85412348 0.52304093 0.67725963 0.31933795 0.2083518 ]
 [0.06961929 0.675661   0.62844156 0.87102426 0.19679131]] 
 b2:
 [[0.11045054]
 [0.69475357]
 [0.50990836]]
W1:
 [[0.80534526 0.6338631  0.59380351]
 [0.73795092 0.85351641 0.14549009]
 [0.82473595 0.72886885 0.22414081]
 [0.21397833 0.6319377  0.65097635]
 [0.84134583 0.38883133 0.02548392]] 
 b1:
 [[0.31283007]
 [0.37217154]
 [0.81791331]
 [0.98338146]
 [0.26823072]]

Part 2: Deep Neural Networks (DNN) ¶

[Total marks for this part: 25 points]

The first part of this assignment is to demonstrate your basis knowledge in deep learning that you have acquired from the lectures and tutorials materials. Most of the contents in this assignment are drawn from the tutorials covered from weeks 1 to 4. Going through these materials before attempting this assignment is highly recommended.

In the first part of this assignment, you are going to work with the FashionMNIST dataset for image recognition task. It has the exact same format as MNIST (70,000 grayscale images of 28 × 28 pixels each with 10 classes), but the images represent fashion items rather than handwritten digits, so each class is more diverse, and the problem is significantly more challenging than MNIST.

**Question 2.1**. Load the Fashion MNIST using Keras datasets¶

[5 points]

We first use keras incoporated in TensorFlow 2.x for loading the training and testing sets.

In [19]:
import tensorflow as tf
from tensorflow import keras
In [20]:
tf.random.set_seed(1234)

We first use keras datasets in TF 2.x to load Fashion MNIST dataset.

In [21]:
fashion_mnist = keras.datasets.fashion_mnist
(X_train_full_img, y_train_full), (X_test_img, y_test) = fashion_mnist.load_data()

The shape of X_train_full_img is $(60000, 28, 28 )$ and that of X_test_img is $(10000, 28, 28)$. We next convert them to matrices of vectors and store in X_train_full and X_test.

In [23]:
num_train = X_train_full_img.shape[0]
num_test = X_test_img.shape[0]

#Get X_train_full and X_test
X_train_full =  X_train_full_img.reshape(num_train,-1)
X_test =   X_test_img.reshape(num_test, -1)

#Print shape of test and train set
print("train set shape:\n",X_train_full.shape, y_train_full.shape)
print("test set shape: \n",X_test.shape, y_test.shape)
train set shape:
 (60000, 784) (60000,)
test set shape: 
 (10000, 784) (10000,)

**Question 2.2**. Preprocess the dataset and split into training, validation, and testing datasets¶

[5 points]

You need to write the code to address the following requirements:

  • Use $10 \%$ of X_train_full for validation and the rest of X_train_full for training. This splits X_train_full and y_train_full into X_train, y_train ($90 \%$) and X_valid, y_valid ($10 \%$).
  • Finally, scale the pixels of X_train, X_valid, and X_test to $[0,1]$) (i.e., $X = X/255.0$).

You have now the separate training, validation, and testing sets for training your model.

In [24]:
import math
N = X_train_full.shape[0]
i = math.floor(0.9*N)
n_classes= 10

shuffle = np.random.permutation(N)

#split and shuffle the dataset according to method taught in the tutorials 
valid_idx = math.floor(0.1*N)
X_train, y_train = X_train_full[shuffle][:i],y_train_full[shuffle][:i]
X_valid, y_valid = X_train_full[shuffle][i:i+valid_idx],y_train_full[shuffle][i:i+valid_idx]
X_train, X_valid, X_test = X_train/255.0 , X_valid/255.0 , X_test/255.0

print ('Train set', X_train.shape, y_train.shape)
print('Validation set', X_valid.shape, y_valid.shape)
print('Test set', X_test.shape, y_test.shape)
Train set (54000, 784) (54000,)
Validation set (6000, 784) (6000,)
Test set (10000, 784) (10000,)

**Question 2.3**. Write code for the feed-forward neural net using TF 2.x¶

[5 points]

We now develop a feed-forward neural network with the architecture $784 \rightarrow 40(ReLU) \rightarrow 30(ReLU) \rightarrow 10(softmax)$. You can choose your own way to implement your network and an optimizer of interest. You should train model in $50$ epochs and evaluate the trained model on the test set.

Baseline Accuracy:

  • Training Accuracy: 90.19%
  • Validation Accuracy: 87.65%
  • Test Accuracy: 87.21%
In [25]:
# Code adapted from tutorials
class DNN:
    def __init__(self,n1,n2,act,n_classes=10, optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
                 batch_size=32, epochs=1, alpha=0.001):
        self.n_classes = n_classes
        self.batch_size = batch_size
        self.epochs = epochs
        self.optimizer = optimizer
        self.alpha = alpha 
        self.n1 = n1
        self.n2 = n2
        self.act = act

        # create a tensorflow dataset for training
        self.train_set = tf.data.Dataset.from_tensor_slices((X_train, y_train))
        # create a tensorflow dataset for validation
        self.valid_set = tf.data.Dataset.from_tensor_slices((X_valid, y_valid))
        # create a tensorflow dataset for validation
        self.test_set = tf.data.Dataset.from_tensor_slices((X_test, y_test))
        # batching train and valid sets
        self.train_set = self.train_set.batch(self.batch_size).prefetch(1)
        self.valid_set = self.valid_set.batch(self.batch_size).prefetch(1)
        self.test_set = self.test_set.batch(self.batch_size).prefetch(1)
        tf.keras.backend.set_floatx('float64')


    def build(self):
        self.model = tf.keras.Sequential([
            tf.keras.layers.Dense(self.n1, activation= self.act),
            tf.keras.layers.Dense(self.n2, activation= self.act),
            tf.keras.layers.Dense(self.n_classes, activation='softmax')
        ])

    def compute_loss(self, X, y):  # X is data batch, y is label batch
        pred_probs = self.model(X)
        l1 = tf.keras.losses.sparse_categorical_crossentropy(y, pred_probs)  # Cross entropy loss
        l2 = tf.add_n([tf.nn.l2_loss(w) for w in self.model.trainable_weights])
        l2 = tf.expand_dims(l2, axis=-1)
        return l1 + self.alpha * l2


    def compute_grads(self, X, y):
        with tf.GradientTape() as g:  # use gradient tape to compute gradients
            loss = self.compute_loss(X, y)
        grads = g.gradient(loss, self.model.trainable_variables)  # compute gradients w.r.t. all trainable variables
        return grads

    def train_one_batch(self, X, y):  # train in one batch
        grads = self.compute_grads(X, y)
        # the gradients will be applied according to optimizer for example SGD, Adam, and etc.
        self.optimizer.apply_gradients(zip(grads, self.model.trainable_variables))

    def evaluate(self, tf_dataset=None):
        dataset_loss = tf.keras.metrics.Mean()
        dataset_accuracy = tf.keras.metrics.SparseCategoricalAccuracy()
        for X, y in tf_dataset:
            loss = self.compute_loss(X, y)
            dataset_loss.update_state(loss)
            dataset_accuracy.update_state(y, self.model(X, training=False))
        return dataset_loss.result(), dataset_accuracy.result()

    def train(self):
        for epoch in range(self.epochs):
            for X, y in self.train_set:  # use batch_index if you want to display something in iterations
                self.train_one_batch(X, y)
            train_loss, train_acc = self.evaluate(self.train_set)
            valid_loss, valid_acc = self.evaluate(self.valid_set)
            print('Epoch {}: train acc= {:.4f}, train loss= {:.4f} | valid acc= {:.4f}, valid loss= {:.4f}'.format(
                epoch + 1, train_acc, train_loss, valid_acc, valid_loss))
        return valid_acc,valid_loss
    
    def evaluate_test_set(self, tf_dataset=None):
        dataset_loss = tf.keras.metrics.Mean()
        dataset_accuracy = tf.keras.metrics.SparseCategoricalAccuracy()
        for X, y in self.test_set:
            loss = self.compute_loss(X, y)
            dataset_loss.update_state(loss)
            dataset_accuracy.update_state(y, self.model(X, training=False))
        return ('test acc = {:.4f}, test loss = {:.4f}'.format(
                dataset_accuracy.result().numpy(), dataset_loss.result().numpy()))
    
opt = tf.keras.optimizers.Adam()
dnn = DNN(40,30,'relu',optimizer=opt, epochs=50, batch_size=64)
dnn.build()
dnn.train()
dnn.evaluate_test_set()
Epoch 1: train acc= 0.8448, train loss= 0.5111 | valid acc= 0.8433, valid loss= 0.5197
Epoch 2: train acc= 0.8578, train loss= 0.4694 | valid acc= 0.8518, valid loss= 0.4884
Epoch 3: train acc= 0.8651, train loss= 0.4498 | valid acc= 0.8565, valid loss= 0.4749
Epoch 4: train acc= 0.8698, train loss= 0.4372 | valid acc= 0.8622, valid loss= 0.4687
Epoch 5: train acc= 0.8728, train loss= 0.4290 | valid acc= 0.8648, valid loss= 0.4646
Epoch 6: train acc= 0.8756, train loss= 0.4203 | valid acc= 0.8663, valid loss= 0.4598
Epoch 7: train acc= 0.8780, train loss= 0.4150 | valid acc= 0.8672, valid loss= 0.4557
Epoch 8: train acc= 0.8811, train loss= 0.4090 | valid acc= 0.8693, valid loss= 0.4532
Epoch 9: train acc= 0.8834, train loss= 0.4041 | valid acc= 0.8707, valid loss= 0.4501
Epoch 10: train acc= 0.8849, train loss= 0.4018 | valid acc= 0.8690, valid loss= 0.4492
Epoch 11: train acc= 0.8873, train loss= 0.3955 | valid acc= 0.8723, valid loss= 0.4443
Epoch 12: train acc= 0.8887, train loss= 0.3925 | valid acc= 0.8720, valid loss= 0.4431
Epoch 13: train acc= 0.8899, train loss= 0.3911 | valid acc= 0.8723, valid loss= 0.4440
Epoch 14: train acc= 0.8918, train loss= 0.3861 | valid acc= 0.8757, valid loss= 0.4394
Epoch 15: train acc= 0.8921, train loss= 0.3862 | valid acc= 0.8738, valid loss= 0.4398
Epoch 16: train acc= 0.8929, train loss= 0.3834 | valid acc= 0.8745, valid loss= 0.4371
Epoch 17: train acc= 0.8921, train loss= 0.3847 | valid acc= 0.8722, valid loss= 0.4393
Epoch 18: train acc= 0.8922, train loss= 0.3847 | valid acc= 0.8745, valid loss= 0.4420
Epoch 19: train acc= 0.8940, train loss= 0.3799 | valid acc= 0.8755, valid loss= 0.4372
Epoch 20: train acc= 0.8936, train loss= 0.3818 | valid acc= 0.8743, valid loss= 0.4398
Epoch 21: train acc= 0.8925, train loss= 0.3861 | valid acc= 0.8732, valid loss= 0.4425
Epoch 22: train acc= 0.8932, train loss= 0.3828 | valid acc= 0.8722, valid loss= 0.4410
Epoch 23: train acc= 0.8943, train loss= 0.3817 | valid acc= 0.8737, valid loss= 0.4413
Epoch 24: train acc= 0.8942, train loss= 0.3820 | valid acc= 0.8728, valid loss= 0.4422
Epoch 25: train acc= 0.8960, train loss= 0.3772 | valid acc= 0.8747, valid loss= 0.4386
Epoch 26: train acc= 0.8963, train loss= 0.3771 | valid acc= 0.8725, valid loss= 0.4370
Epoch 27: train acc= 0.8964, train loss= 0.3754 | valid acc= 0.8733, valid loss= 0.4376
Epoch 28: train acc= 0.8956, train loss= 0.3781 | valid acc= 0.8717, valid loss= 0.4388
Epoch 29: train acc= 0.8966, train loss= 0.3741 | valid acc= 0.8742, valid loss= 0.4360
Epoch 30: train acc= 0.8960, train loss= 0.3758 | valid acc= 0.8753, valid loss= 0.4384
Epoch 31: train acc= 0.8949, train loss= 0.3793 | valid acc= 0.8742, valid loss= 0.4425
Epoch 32: train acc= 0.8957, train loss= 0.3771 | valid acc= 0.8737, valid loss= 0.4409
Epoch 33: train acc= 0.8982, train loss= 0.3739 | valid acc= 0.8725, valid loss= 0.4397
Epoch 34: train acc= 0.8965, train loss= 0.3753 | valid acc= 0.8740, valid loss= 0.4408
Epoch 35: train acc= 0.8964, train loss= 0.3760 | valid acc= 0.8732, valid loss= 0.4419
Epoch 36: train acc= 0.8979, train loss= 0.3733 | valid acc= 0.8753, valid loss= 0.4400
Epoch 37: train acc= 0.8978, train loss= 0.3753 | valid acc= 0.8723, valid loss= 0.4420
Epoch 38: train acc= 0.8960, train loss= 0.3793 | valid acc= 0.8735, valid loss= 0.4455
Epoch 39: train acc= 0.8985, train loss= 0.3732 | valid acc= 0.8742, valid loss= 0.4417
Epoch 40: train acc= 0.8986, train loss= 0.3735 | valid acc= 0.8742, valid loss= 0.4427
Epoch 41: train acc= 0.8984, train loss= 0.3719 | valid acc= 0.8762, valid loss= 0.4397
Epoch 42: train acc= 0.8995, train loss= 0.3694 | valid acc= 0.8757, valid loss= 0.4373
Epoch 43: train acc= 0.8996, train loss= 0.3710 | valid acc= 0.8753, valid loss= 0.4415
Epoch 44: train acc= 0.8998, train loss= 0.3714 | valid acc= 0.8752, valid loss= 0.4422
Epoch 45: train acc= 0.9011, train loss= 0.3692 | valid acc= 0.8760, valid loss= 0.4396
Epoch 46: train acc= 0.9007, train loss= 0.3698 | valid acc= 0.8770, valid loss= 0.4419
Epoch 47: train acc= 0.9011, train loss= 0.3684 | valid acc= 0.8748, valid loss= 0.4407
Epoch 48: train acc= 0.9016, train loss= 0.3681 | valid acc= 0.8780, valid loss= 0.4394
Epoch 49: train acc= 0.9010, train loss= 0.3707 | valid acc= 0.8752, valid loss= 0.4425
Epoch 50: train acc= 0.9019, train loss= 0.3674 | valid acc= 0.8765, valid loss= 0.4411
Out[25]:
'test acc = 0.8721, test loss = 0.4527'

**Question 2.4**. Tuning hyper-parameters with grid search¶

[5 points]

Assume that you need to tune the number of neurons on the first and second hidden layers $n_1 \in \{20, 40\}$, $n_2 \in \{20, 40\}$ and the used activation function $act \in \{sigmoid, tanh, relu\}$. The network has the architecture pattern $784 \rightarrow n_1 (act) \rightarrow n_2(act) \rightarrow 10(softmax)$ where $n_1, n_2$, and $act$ are in their grides. Write the code to tune the hyper-parameters $n_1, n_2$, and $act$. Note that you can freely choose the optimizer and learning rate of interest for this task.

For this question, I have decided to use the original SGD optimizer with the learning rate of 0.001 in order to test it. In the output space below, you will find the results of each model configuration running 10 epochs each. The final result shows n1= 40, n2 = 40, activation function = tanh as the best configuration.

Best Config (40,40,tanh) Accuracy after 10 epochs:

  • Training Accuracy: 84.78%
  • Validation Accuracy: 85.25%
  • Test Accuracy: 85.25%
In [40]:
# Perform grid search using each possible configurtaions, compute 10 epochs and save the best values / configuration
lst_n1 = [20,40]
lst_n2 = [20,40]
lst_activation = ['sigmoid','tanh','relu']
best_acc= - np.inf
best_history = None
best_model = None
for n1 in lst_n1:
    for n2 in lst_n2:
        for act in lst_activation:
            print('\nThis is for model n1= {}, n2 = {}, activation function = {}'.format(n1,n2, act))
            opt = tf.keras.optimizers.SGD(learning_rate=0.001)
            dnn = DNN(n1,n2,act,optimizer=opt, epochs=10, batch_size=64)
            dnn.build()
            valid_acc, valid_loss = dnn.train()
            print('\tvalid acc = {}, valid loss = {}'.format(valid_acc, valid_loss))
            if(valid_acc > best_acc):
                best_model = dnn
                best_acc = valid_acc
                best_n1 = n1
                best_n2 = n2
                best_act = act

print('\nThe best model is with n1= {}, n2 = {}, activation function = {}'.format(best_n1,best_n2, best_act))
This is for model n1= 20, n2 = 20, activation function = sigmoid
Epoch 1: train acc= 0.4086, train loss= 1.9378 | valid acc= 0.4068, valid loss= 1.9373
Epoch 2: train acc= 0.5539, train loss= 1.7302 | valid acc= 0.5563, valid loss= 1.7289
Epoch 3: train acc= 0.5677, train loss= 1.6562 | valid acc= 0.5697, valid loss= 1.6547
Epoch 4: train acc= 0.5812, train loss= 1.6286 | valid acc= 0.5822, valid loss= 1.6269
Epoch 5: train acc= 0.5960, train loss= 1.6133 | valid acc= 0.5995, valid loss= 1.6115
Epoch 6: train acc= 0.6110, train loss= 1.6030 | valid acc= 0.6090, valid loss= 1.6011
Epoch 7: train acc= 0.6218, train loss= 1.5955 | valid acc= 0.6195, valid loss= 1.5935
Epoch 8: train acc= 0.6276, train loss= 1.5899 | valid acc= 0.6243, valid loss= 1.5878
Epoch 9: train acc= 0.6314, train loss= 1.5858 | valid acc= 0.6273, valid loss= 1.5837
Epoch 10: train acc= 0.6340, train loss= 1.5828 | valid acc= 0.6305, valid loss= 1.5806
	valid acc = 0.6305, valid loss = 1.5805526927139606

This is for model n1= 20, n2 = 20, activation function = tanh
Epoch 1: train acc= 0.8189, train loss= 0.8983 | valid acc= 0.8262, valid loss= 0.8904
Epoch 2: train acc= 0.8326, train loss= 0.8179 | valid acc= 0.8373, valid loss= 0.8120
Epoch 3: train acc= 0.8372, train loss= 0.7907 | valid acc= 0.8422, valid loss= 0.7857
Epoch 4: train acc= 0.8400, train loss= 0.7792 | valid acc= 0.8445, valid loss= 0.7746
Epoch 5: train acc= 0.8412, train loss= 0.7733 | valid acc= 0.8455, valid loss= 0.7689
Epoch 6: train acc= 0.8424, train loss= 0.7697 | valid acc= 0.8468, valid loss= 0.7655
Epoch 7: train acc= 0.8426, train loss= 0.7674 | valid acc= 0.8472, valid loss= 0.7633
Epoch 8: train acc= 0.8428, train loss= 0.7657 | valid acc= 0.8482, valid loss= 0.7617
Epoch 9: train acc= 0.8434, train loss= 0.7644 | valid acc= 0.8480, valid loss= 0.7605
Epoch 10: train acc= 0.8436, train loss= 0.7634 | valid acc= 0.8483, valid loss= 0.7595
	valid acc = 0.8483333333333334, valid loss = 0.75954389014383

This is for model n1= 20, n2 = 20, activation function = relu
Epoch 1: train acc= 0.8170, train loss= 0.8341 | valid acc= 0.8268, valid loss= 0.8276
Epoch 2: train acc= 0.8306, train loss= 0.7501 | valid acc= 0.8380, valid loss= 0.7463
Epoch 3: train acc= 0.8346, train loss= 0.7235 | valid acc= 0.8392, valid loss= 0.7208
Epoch 4: train acc= 0.8372, train loss= 0.7125 | valid acc= 0.8390, valid loss= 0.7106
Epoch 5: train acc= 0.8389, train loss= 0.7064 | valid acc= 0.8413, valid loss= 0.7046
Epoch 6: train acc= 0.8394, train loss= 0.7040 | valid acc= 0.8418, valid loss= 0.7023
Epoch 7: train acc= 0.8397, train loss= 0.7021 | valid acc= 0.8438, valid loss= 0.7004
Epoch 8: train acc= 0.8396, train loss= 0.7016 | valid acc= 0.8447, valid loss= 0.6999
Epoch 9: train acc= 0.8402, train loss= 0.7012 | valid acc= 0.8432, valid loss= 0.6995
Epoch 10: train acc= 0.8412, train loss= 0.6974 | valid acc= 0.8462, valid loss= 0.6956
	valid acc = 0.8461666666666666, valid loss = 0.6956203009179754

This is for model n1= 20, n2 = 40, activation function = sigmoid
Epoch 1: train acc= 0.4802, train loss= 1.9050 | valid acc= 0.4842, valid loss= 1.9036
Epoch 2: train acc= 0.5258, train loss= 1.6948 | valid acc= 0.5308, valid loss= 1.6902
Epoch 3: train acc= 0.5655, train loss= 1.6388 | valid acc= 0.5692, valid loss= 1.6338
Epoch 4: train acc= 0.5841, train loss= 1.6130 | valid acc= 0.5882, valid loss= 1.6082
Epoch 5: train acc= 0.5939, train loss= 1.5965 | valid acc= 0.5987, valid loss= 1.5924
Epoch 6: train acc= 0.6021, train loss= 1.5845 | valid acc= 0.6070, valid loss= 1.5807
Epoch 7: train acc= 0.6102, train loss= 1.5751 | valid acc= 0.6158, valid loss= 1.5717
Epoch 8: train acc= 0.6185, train loss= 1.5659 | valid acc= 0.6202, valid loss= 1.5628
Epoch 9: train acc= 0.6275, train loss= 1.5558 | valid acc= 0.6260, valid loss= 1.5529
Epoch 10: train acc= 0.6371, train loss= 1.5455 | valid acc= 0.6362, valid loss= 1.5426
	valid acc = 0.6361666666666667, valid loss = 1.542632225107603

This is for model n1= 20, n2 = 40, activation function = tanh
Epoch 1: train acc= 0.8239, train loss= 0.8762 | valid acc= 0.8340, valid loss= 0.8689
Epoch 2: train acc= 0.8367, train loss= 0.7930 | valid acc= 0.8423, valid loss= 0.7863
Epoch 3: train acc= 0.8410, train loss= 0.7651 | valid acc= 0.8448, valid loss= 0.7587
Epoch 4: train acc= 0.8432, train loss= 0.7536 | valid acc= 0.8470, valid loss= 0.7473
Epoch 5: train acc= 0.8439, train loss= 0.7478 | valid acc= 0.8490, valid loss= 0.7416
Epoch 6: train acc= 0.8442, train loss= 0.7443 | valid acc= 0.8497, valid loss= 0.7383
Epoch 7: train acc= 0.8444, train loss= 0.7420 | valid acc= 0.8510, valid loss= 0.7361
Epoch 8: train acc= 0.8448, train loss= 0.7402 | valid acc= 0.8512, valid loss= 0.7345
Epoch 9: train acc= 0.8451, train loss= 0.7388 | valid acc= 0.8513, valid loss= 0.7332
Epoch 10: train acc= 0.8452, train loss= 0.7377 | valid acc= 0.8510, valid loss= 0.7321
	valid acc = 0.851, valid loss = 0.7321288660559229

This is for model n1= 20, n2 = 40, activation function = relu
Epoch 1: train acc= 0.8120, train loss= 0.8586 | valid acc= 0.8227, valid loss= 0.8473
Epoch 2: train acc= 0.8259, train loss= 0.7630 | valid acc= 0.8370, valid loss= 0.7529
Epoch 3: train acc= 0.8310, train loss= 0.7307 | valid acc= 0.8427, valid loss= 0.7214
Epoch 4: train acc= 0.8342, train loss= 0.7166 | valid acc= 0.8443, valid loss= 0.7083
Epoch 5: train acc= 0.8370, train loss= 0.7090 | valid acc= 0.8460, valid loss= 0.7018
Epoch 6: train acc= 0.8390, train loss= 0.7048 | valid acc= 0.8467, valid loss= 0.6986
Epoch 7: train acc= 0.8404, train loss= 0.7022 | valid acc= 0.8463, valid loss= 0.6969
Epoch 8: train acc= 0.8414, train loss= 0.7000 | valid acc= 0.8475, valid loss= 0.6954
Epoch 9: train acc= 0.8422, train loss= 0.6988 | valid acc= 0.8475, valid loss= 0.6947
Epoch 10: train acc= 0.8428, train loss= 0.6977 | valid acc= 0.8463, valid loss= 0.6940
	valid acc = 0.8463333333333334, valid loss = 0.6939578115844145

This is for model n1= 40, n2 = 20, activation function = sigmoid
Epoch 1: train acc= 0.5335, train loss= 1.8716 | valid acc= 0.5398, valid loss= 1.8687
Epoch 2: train acc= 0.5695, train loss= 1.6742 | valid acc= 0.5695, valid loss= 1.6713
Epoch 3: train acc= 0.5835, train loss= 1.6071 | valid acc= 0.5860, valid loss= 1.6042
Epoch 4: train acc= 0.5970, train loss= 1.5741 | valid acc= 0.6035, valid loss= 1.5710
Epoch 5: train acc= 0.6095, train loss= 1.5544 | valid acc= 0.6120, valid loss= 1.5511
Epoch 6: train acc= 0.6196, train loss= 1.5409 | valid acc= 0.6190, valid loss= 1.5375
Epoch 7: train acc= 0.6263, train loss= 1.5311 | valid acc= 0.6290, valid loss= 1.5276
Epoch 8: train acc= 0.6333, train loss= 1.5241 | valid acc= 0.6318, valid loss= 1.5206
Epoch 9: train acc= 0.6401, train loss= 1.5193 | valid acc= 0.6372, valid loss= 1.5158
Epoch 10: train acc= 0.6492, train loss= 1.5159 | valid acc= 0.6450, valid loss= 1.5124
	valid acc = 0.645, valid loss = 1.5124126107648554

This is for model n1= 40, n2 = 20, activation function = tanh
Epoch 1: train acc= 0.8199, train loss= 0.9488 | valid acc= 0.8318, valid loss= 0.9380
Epoch 2: train acc= 0.8319, train loss= 0.8268 | valid acc= 0.8392, valid loss= 0.8179
Epoch 3: train acc= 0.8369, train loss= 0.7848 | valid acc= 0.8425, valid loss= 0.7772
Epoch 4: train acc= 0.8397, train loss= 0.7686 | valid acc= 0.8455, valid loss= 0.7619
Epoch 5: train acc= 0.8416, train loss= 0.7615 | valid acc= 0.8470, valid loss= 0.7554
Epoch 6: train acc= 0.8431, train loss= 0.7577 | valid acc= 0.8475, valid loss= 0.7520
Epoch 7: train acc= 0.8439, train loss= 0.7553 | valid acc= 0.8485, valid loss= 0.7500
Epoch 8: train acc= 0.8446, train loss= 0.7537 | valid acc= 0.8485, valid loss= 0.7486
Epoch 9: train acc= 0.8449, train loss= 0.7525 | valid acc= 0.8500, valid loss= 0.7476
Epoch 10: train acc= 0.8451, train loss= 0.7516 | valid acc= 0.8502, valid loss= 0.7469
	valid acc = 0.8501666666666666, valid loss = 0.7468579011961676

This is for model n1= 40, n2 = 20, activation function = relu
Epoch 1: train acc= 0.8256, train loss= 0.8910 | valid acc= 0.8317, valid loss= 0.8841
Epoch 2: train acc= 0.8363, train loss= 0.7657 | valid acc= 0.8443, valid loss= 0.7600
Epoch 3: train acc= 0.8403, train loss= 0.7246 | valid acc= 0.8475, valid loss= 0.7197
Epoch 4: train acc= 0.8421, train loss= 0.7094 | valid acc= 0.8495, valid loss= 0.7051
Epoch 5: train acc= 0.8429, train loss= 0.7031 | valid acc= 0.8488, valid loss= 0.6995
Epoch 6: train acc= 0.8443, train loss= 0.6994 | valid acc= 0.8510, valid loss= 0.6962
Epoch 7: train acc= 0.8448, train loss= 0.6971 | valid acc= 0.8517, valid loss= 0.6942
Epoch 8: train acc= 0.8452, train loss= 0.6961 | valid acc= 0.8507, valid loss= 0.6935
Epoch 9: train acc= 0.8454, train loss= 0.6949 | valid acc= 0.8512, valid loss= 0.6924
Epoch 10: train acc= 0.8458, train loss= 0.6937 | valid acc= 0.8520, valid loss= 0.6914
	valid acc = 0.852, valid loss = 0.6914116841091157

This is for model n1= 40, n2 = 40, activation function = sigmoid
Epoch 1: train acc= 0.5500, train loss= 1.8929 | valid acc= 0.5500, valid loss= 1.8891
Epoch 2: train acc= 0.5796, train loss= 1.6515 | valid acc= 0.5815, valid loss= 1.6473
Epoch 3: train acc= 0.5964, train loss= 1.5792 | valid acc= 0.5997, valid loss= 1.5746
Epoch 4: train acc= 0.6082, train loss= 1.5494 | valid acc= 0.6098, valid loss= 1.5446
Epoch 5: train acc= 0.6198, train loss= 1.5339 | valid acc= 0.6212, valid loss= 1.5289
Epoch 6: train acc= 0.6316, train loss= 1.5238 | valid acc= 0.6360, valid loss= 1.5188
Epoch 7: train acc= 0.6455, train loss= 1.5157 | valid acc= 0.6468, valid loss= 1.5107
Epoch 8: train acc= 0.6604, train loss= 1.5071 | valid acc= 0.6618, valid loss= 1.5024
Epoch 9: train acc= 0.6738, train loss= 1.4965 | valid acc= 0.6722, valid loss= 1.4925
Epoch 10: train acc= 0.6830, train loss= 1.4858 | valid acc= 0.6805, valid loss= 1.4824
	valid acc = 0.6805, valid loss = 1.4823851180197463

This is for model n1= 40, n2 = 40, activation function = tanh
Epoch 1: train acc= 0.8302, train loss= 0.9330 | valid acc= 0.8393, valid loss= 0.9244
Epoch 2: train acc= 0.8391, train loss= 0.8040 | valid acc= 0.8458, valid loss= 0.7967
Epoch 3: train acc= 0.8428, train loss= 0.7601 | valid acc= 0.8472, valid loss= 0.7535
Epoch 4: train acc= 0.8444, train loss= 0.7431 | valid acc= 0.8487, valid loss= 0.7371
Epoch 5: train acc= 0.8456, train loss= 0.7355 | valid acc= 0.8493, valid loss= 0.7299
Epoch 6: train acc= 0.8462, train loss= 0.7313 | valid acc= 0.8503, valid loss= 0.7259
Epoch 7: train acc= 0.8467, train loss= 0.7286 | valid acc= 0.8508, valid loss= 0.7235
Epoch 8: train acc= 0.8469, train loss= 0.7267 | valid acc= 0.8513, valid loss= 0.7217
Epoch 9: train acc= 0.8475, train loss= 0.7251 | valid acc= 0.8522, valid loss= 0.7203
Epoch 10: train acc= 0.8478, train loss= 0.7237 | valid acc= 0.8525, valid loss= 0.7190
	valid acc = 0.8525, valid loss = 0.7190313472134627

This is for model n1= 40, n2 = 40, activation function = relu
Epoch 1: train acc= 0.8209, train loss= 0.9281 | valid acc= 0.8328, valid loss= 0.9171
Epoch 2: train acc= 0.8326, train loss= 0.7785 | valid acc= 0.8440, valid loss= 0.7703
Epoch 3: train acc= 0.8372, train loss= 0.7293 | valid acc= 0.8467, valid loss= 0.7229
Epoch 4: train acc= 0.8393, train loss= 0.7111 | valid acc= 0.8477, valid loss= 0.7055
Epoch 5: train acc= 0.8405, train loss= 0.7035 | valid acc= 0.8482, valid loss= 0.6980
Epoch 6: train acc= 0.8413, train loss= 0.6995 | valid acc= 0.8488, valid loss= 0.6945
Epoch 7: train acc= 0.8419, train loss= 0.6969 | valid acc= 0.8487, valid loss= 0.6922
Epoch 8: train acc= 0.8432, train loss= 0.6946 | valid acc= 0.8490, valid loss= 0.6902
Epoch 9: train acc= 0.8437, train loss= 0.6931 | valid acc= 0.8497, valid loss= 0.6888
Epoch 10: train acc= 0.8443, train loss= 0.6920 | valid acc= 0.8498, valid loss= 0.6879
	valid acc = 0.8498333333333333, valid loss = 0.6879278224979427

The best model is with n1= 40, n2 = 40, activation function = tanh
In [41]:
best_model.model.save('models/best_cnn.h5')
WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.

**Question 2.5**. Experimenting with sharpness-aware minimization technique¶

[5 points]

Sharpness-aware minimization (SAM) (i.e., link for main paper from Google Deepmind) is a simple yet but efficient technique to improve the generalization ability of deep learning models on unseen data examples. In your research or your work, you might potentially use this idea. Your task is to read the paper and implement Sharpness-aware minimization (SAM). Finally, you need to apply SAM to the best architecture found in Question 2.4.

After using SAM, we can see that our model improved marginally when compared with the accuracies of the best config above.

Best Config (40,40,tanh) Accuracy without SAM:

  • Training Accuracy: 84.78%
  • Validation Accuracy: 85.25%
  • Test Accuracy: 85.25%

Best Config (40,40,tanh) Accuracy with SAM:

  • Training Accuracy: 88.29%
  • Validation Accuracy: 86.47%
  • Test Accuracy: 86.16%

Our model with SAM is able to generalize better as reflected in the training and validation accuracy since we apply the gradients twice at each turn (once with the default values / gradients and another with gradients after applying sharpness scaling).

In [223]:
class SAM(tf.keras.optimizers.Optimizer):
    def __init__(self, base_optimizer, rho=0.05):
        super(SAM, self).__init__(name="SAM")
        self.base_optimizer = base_optimizer
        self.rho = 0.05
        self.epsilon = 1e-12
    
    def apply_gradients (self,zipped_grads_and_trainable_variables):
        # unzip the values taken in from the model
        grads, trainable = zip(*zipped_grads_and_trainable_variables)
        
        #zip it back and pass it into our base optimizer (Adam in this case)
        self.base_optimizer.apply_gradients(zip(grads,trainable))
        
        #initialize sharpness value
        sharpness = 0.0
        #compute the total sharpness value by taking the square of each gradients and reduce it into a single value
        for grad in grads:
            sharpness += tf.reduce_sum(tf.square(grad))
        
        #initalize scaled_gradients array
        scaled_grad = []
        #apply sharpness to each gradient value
        for grad in grads:
            scaled_grad.append(self.rho * grad / (sharpness + self.epsilon))
        
        #reapply gradients with the base optimizer with the sharpness adjusted values
        self.base_optimizer.apply_gradients(zip(scaled_grad,trainable))
        
sam = SAM(opt)
opt = tf.keras.optimizers.Adam()
dnn = DNN(40,40,'tanh',optimizer=sam, epochs=50, batch_size=32)
dnn.build()
dnn.train()
dnn.evaluate_test_set()
        
        
Epoch 1: train acc= 0.8462, train loss= 0.5069 | valid acc= 0.8380, valid loss= 0.5244
Epoch 2: train acc= 0.8607, train loss= 0.4787 | valid acc= 0.8503, valid loss= 0.5011
Epoch 3: train acc= 0.8665, train loss= 0.4675 | valid acc= 0.8552, valid loss= 0.4931
Epoch 4: train acc= 0.8694, train loss= 0.4604 | valid acc= 0.8555, valid loss= 0.4882
Epoch 5: train acc= 0.8709, train loss= 0.4574 | valid acc= 0.8582, valid loss= 0.4876
Epoch 6: train acc= 0.8725, train loss= 0.4542 | valid acc= 0.8590, valid loss= 0.4862
Epoch 7: train acc= 0.8724, train loss= 0.4551 | valid acc= 0.8582, valid loss= 0.4887
Epoch 8: train acc= 0.8725, train loss= 0.4551 | valid acc= 0.8582, valid loss= 0.4901
Epoch 9: train acc= 0.8746, train loss= 0.4503 | valid acc= 0.8605, valid loss= 0.4855
Epoch 10: train acc= 0.8751, train loss= 0.4510 | valid acc= 0.8605, valid loss= 0.4865
Epoch 11: train acc= 0.8760, train loss= 0.4496 | valid acc= 0.8617, valid loss= 0.4848
Epoch 12: train acc= 0.8762, train loss= 0.4482 | valid acc= 0.8630, valid loss= 0.4836
Epoch 13: train acc= 0.8771, train loss= 0.4483 | valid acc= 0.8628, valid loss= 0.4825
Epoch 14: train acc= 0.8758, train loss= 0.4510 | valid acc= 0.8597, valid loss= 0.4860
Epoch 15: train acc= 0.8776, train loss= 0.4484 | valid acc= 0.8617, valid loss= 0.4842
Epoch 16: train acc= 0.8778, train loss= 0.4465 | valid acc= 0.8623, valid loss= 0.4826
Epoch 17: train acc= 0.8796, train loss= 0.4409 | valid acc= 0.8623, valid loss= 0.4777
Epoch 18: train acc= 0.8803, train loss= 0.4405 | valid acc= 0.8638, valid loss= 0.4774
Epoch 19: train acc= 0.8801, train loss= 0.4418 | valid acc= 0.8633, valid loss= 0.4792
Epoch 20: train acc= 0.8805, train loss= 0.4405 | valid acc= 0.8637, valid loss= 0.4780
Epoch 21: train acc= 0.8815, train loss= 0.4393 | valid acc= 0.8628, valid loss= 0.4775
Epoch 22: train acc= 0.8814, train loss= 0.4401 | valid acc= 0.8648, valid loss= 0.4783
Epoch 23: train acc= 0.8818, train loss= 0.4391 | valid acc= 0.8640, valid loss= 0.4774
Epoch 24: train acc= 0.8816, train loss= 0.4390 | valid acc= 0.8653, valid loss= 0.4779
Epoch 25: train acc= 0.8820, train loss= 0.4383 | valid acc= 0.8655, valid loss= 0.4771
Epoch 26: train acc= 0.8817, train loss= 0.4373 | valid acc= 0.8660, valid loss= 0.4765
Epoch 27: train acc= 0.8823, train loss= 0.4361 | valid acc= 0.8653, valid loss= 0.4758
Epoch 28: train acc= 0.8823, train loss= 0.4363 | valid acc= 0.8662, valid loss= 0.4758
Epoch 29: train acc= 0.8833, train loss= 0.4341 | valid acc= 0.8665, valid loss= 0.4742
Epoch 30: train acc= 0.8826, train loss= 0.4358 | valid acc= 0.8657, valid loss= 0.4762
Epoch 31: train acc= 0.8832, train loss= 0.4335 | valid acc= 0.8668, valid loss= 0.4739
Epoch 32: train acc= 0.8827, train loss= 0.4345 | valid acc= 0.8665, valid loss= 0.4757
Epoch 33: train acc= 0.8823, train loss= 0.4349 | valid acc= 0.8658, valid loss= 0.4762
Epoch 34: train acc= 0.8841, train loss= 0.4320 | valid acc= 0.8683, valid loss= 0.4740
Epoch 35: train acc= 0.8850, train loss= 0.4310 | valid acc= 0.8665, valid loss= 0.4732
Epoch 36: train acc= 0.8837, train loss= 0.4327 | valid acc= 0.8667, valid loss= 0.4753
Epoch 37: train acc= 0.8845, train loss= 0.4306 | valid acc= 0.8682, valid loss= 0.4734
Epoch 38: train acc= 0.8836, train loss= 0.4328 | valid acc= 0.8668, valid loss= 0.4755
Epoch 39: train acc= 0.8843, train loss= 0.4307 | valid acc= 0.8675, valid loss= 0.4735
Epoch 40: train acc= 0.8848, train loss= 0.4294 | valid acc= 0.8660, valid loss= 0.4718
Epoch 41: train acc= 0.8846, train loss= 0.4307 | valid acc= 0.8650, valid loss= 0.4735
Epoch 42: train acc= 0.8836, train loss= 0.4318 | valid acc= 0.8658, valid loss= 0.4750
Epoch 43: train acc= 0.8839, train loss= 0.4317 | valid acc= 0.8653, valid loss= 0.4754
Epoch 44: train acc= 0.8841, train loss= 0.4312 | valid acc= 0.8657, valid loss= 0.4748
Epoch 45: train acc= 0.8832, train loss= 0.4332 | valid acc= 0.8638, valid loss= 0.4768
Epoch 46: train acc= 0.8835, train loss= 0.4337 | valid acc= 0.8630, valid loss= 0.4770
Epoch 47: train acc= 0.8835, train loss= 0.4315 | valid acc= 0.8652, valid loss= 0.4751
Epoch 48: train acc= 0.8836, train loss= 0.4318 | valid acc= 0.8652, valid loss= 0.4751
Epoch 49: train acc= 0.8838, train loss= 0.4322 | valid acc= 0.8653, valid loss= 0.4757
Epoch 50: train acc= 0.8829, train loss= 0.4339 | valid acc= 0.8647, valid loss= 0.4777
Out[223]:
'test acc = 0.8616, test loss = 0.4981'

Part 3: Convolutional Neural Networks and Image Classification¶

[Total marks for this part: 40 points]

This part of the asssignment is designed to assess your knowledge and coding skill with Tensorflow as well as hands-on experience with training Convolutional Neural Network (CNN).

The dataset used for this part is a specific dataset for this unit consisting of approximately $10,000$ images of $20$ classes, each of which has approximately 500 images. You can download the dataset at download here and then decompress to the folder datasets\FIT5215_Dataset in your assignment folder.

Your task is to build a CNN model using TF 2.x to classify the images. You're provided with the module models.py, which you can find in the assignment folder, with some of the following classes:

  1. DatasetManager: Support with loading and spliting the dataset into the train-val-test sets. It also supports generating next batches for training. DatasetManager will be passed to CNN model for training and testing.
  2. DefaultModel: A base class for the CNN model.
  3. YourModel: The class you'll need to implement for building your CNN model. It inherits some useful attributes and functions from the base class DefaultModel
  4. Note that you can freely modify the models.py file for your purposes.

Firstly, we need to run the following cells to load and preprocess the FIT5215 dataset.

In [4]:
%load_ext autoreload
%autoreload 2

Install the package imutils if you have not installed yet

In [5]:
! pip install imutils
Requirement already satisfied: imutils in c:\users\manut\anaconda3\envs\gpu\lib\site-packages (0.5.4)
In [6]:
import os
import matplotlib.pyplot as plt
plt.style.use('ggplot')
%matplotlib inline
import models
from models import SimplePreprocessor, DatasetManager, DefaultModel
In [7]:
def create_label_folder_dict(adir):
    sub_folders= [folder for folder in os.listdir(adir)
                  if os.path.isdir(os.path.join(adir, folder))]
    label_folder_dict= dict()
    for folder in sub_folders:
        item= {folder: os.path.abspath(os.path.join(adir, folder))}
        label_folder_dict.update(item)
    return label_folder_dict
In [8]:
label_folder_dict= create_label_folder_dict("./datasets/FIT5215_Dataset")

The below code helps to create a data manager that contains all relevant methods used to manage and process the experimental data.

In [9]:
sp = SimplePreprocessor(width=32, height=32)
data_manager = DatasetManager([sp])
data_manager.load(label_folder_dict, verbose=100)
data_manager.process_data_label()
data_manager.train_valid_test_split()
birds 512
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
bottles 432
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
breads 432
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
butterfiles 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
cakes 432
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
cats 501
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
chickens 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
cows 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
dogs 501
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
ducks 496
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
elephants 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
fishes 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
handguns 448
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
horses 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
lions 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
lipsticks 400
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
seals 448
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
snakes 496
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
spiders 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
vases 368
Processed 100/500
Processed 200/500
Processed 300/500

Note that the object data_manager has the attributes relating to the training, validation, and testing sets as shown belows. You can use them in training your developped models in the sequel.

In [10]:
print(data_manager.X_train.shape, data_manager.y_train.shape)
print(data_manager.X_valid.shape, data_manager.y_valid.shape)
print(data_manager.X_test.shape, data_manager.y_test.shape)
print(data_manager.classes)
(7560, 32, 32, 3) (7560,)
(946, 32, 32, 3) (946,)
(946, 32, 32, 3) (946,)
['birds' 'bottles' 'breads' 'butterfiles' 'cakes' 'cats' 'chickens' 'cows'
 'dogs' 'ducks' 'elephants' 'fishes' 'handguns' 'horses' 'lions'
 'lipsticks' 'seals' 'snakes' 'spiders' 'vases']

We now run the default model built in the models.py file which serves as a basic baseline to start the investigation. Follow the following steps to realize how to run a model and know the built-in methods associated to a model developped in the DefaultModel class.

We first initialize a default model from the DefaultModel class. Basically, we can define the relevant parameters of training a model including num_classes, optimizer, learning_rate, batch_size, and num_epochs.

In [33]:
network1 = DefaultModel(name='network1',
                       num_classes=len(data_manager.classes),
                       optimizer='sgd',
                       batch_size= 128,
                       num_epochs = 20,
                       learning_rate=0.1)

The method build_cnn() assists us in building your convolutional neural network. You can view the code (in the models.py file) of the model behind a default model to realize how simple it is. Additionally, the method summary() shows the architecture of a model.

In [34]:
network1.build_cnn()
network1.summary()
Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 32, 32, 32)        896       
                                                                 
 conv2d_1 (Conv2D)           (None, 32, 32, 32)        9248      
                                                                 
 average_pooling2d (AverageP  (None, 16, 16, 32)       0         
 ooling2D)                                                       
                                                                 
 conv2d_2 (Conv2D)           (None, 16, 16, 64)        18496     
                                                                 
 conv2d_3 (Conv2D)           (None, 16, 16, 64)        36928     
                                                                 
 average_pooling2d_1 (Averag  (None, 8, 8, 64)         0         
 ePooling2D)                                                     
                                                                 
 flatten (Flatten)           (None, 4096)              0         
                                                                 
 dense_3 (Dense)             (None, 20)                81940     
                                                                 
=================================================================
Total params: 147,508
Trainable params: 147,508
Non-trainable params: 0
_________________________________________________________________
None

To train a model regarding to the datasets stored in data_manager, you can invoke the method fit() for which you can specify the batch size and number of epochs for your training.

In [35]:
network1.fit(data_manager, batch_size = 64, num_epochs = 20)
Epoch 1/20
119/119 [==============================] - 10s 42ms/step - loss: 2.8829 - accuracy: 0.1139 - val_loss: 2.9179 - val_accuracy: 0.1226
Epoch 2/20
119/119 [==============================] - 5s 40ms/step - loss: 2.5422 - accuracy: 0.2311 - val_loss: 2.8218 - val_accuracy: 0.1797
Epoch 3/20
119/119 [==============================] - 5s 39ms/step - loss: 2.3153 - accuracy: 0.2954 - val_loss: 3.2079 - val_accuracy: 0.1723
Epoch 4/20
119/119 [==============================] - 5s 39ms/step - loss: 2.2063 - accuracy: 0.3272 - val_loss: 3.3495 - val_accuracy: 0.1734
Epoch 5/20
119/119 [==============================] - 5s 40ms/step - loss: 2.0907 - accuracy: 0.3567 - val_loss: 2.5826 - val_accuracy: 0.2093
Epoch 6/20
119/119 [==============================] - 5s 40ms/step - loss: 1.9664 - accuracy: 0.3976 - val_loss: 2.9541 - val_accuracy: 0.2178
Epoch 7/20
119/119 [==============================] - 5s 40ms/step - loss: 1.8645 - accuracy: 0.4266 - val_loss: 2.5988 - val_accuracy: 0.2347
Epoch 8/20
119/119 [==============================] - 5s 40ms/step - loss: 1.7803 - accuracy: 0.4540 - val_loss: 2.3764 - val_accuracy: 0.2801
Epoch 9/20
119/119 [==============================] - 5s 40ms/step - loss: 1.6480 - accuracy: 0.4926 - val_loss: 6.8173 - val_accuracy: 0.1564
Epoch 10/20
119/119 [==============================] - 5s 40ms/step - loss: 1.7674 - accuracy: 0.4706 - val_loss: 2.4231 - val_accuracy: 0.3277
Epoch 11/20
119/119 [==============================] - 5s 41ms/step - loss: 1.4942 - accuracy: 0.5414 - val_loss: 2.8209 - val_accuracy: 0.2928
Epoch 12/20
119/119 [==============================] - 5s 41ms/step - loss: 1.3854 - accuracy: 0.5694 - val_loss: 2.6143 - val_accuracy: 0.3510
Epoch 13/20
119/119 [==============================] - 5s 40ms/step - loss: 1.2694 - accuracy: 0.6041 - val_loss: 2.4816 - val_accuracy: 0.3552
Epoch 14/20
119/119 [==============================] - 5s 41ms/step - loss: 1.1388 - accuracy: 0.6496 - val_loss: 2.6341 - val_accuracy: 0.3510
Epoch 15/20
119/119 [==============================] - 5s 41ms/step - loss: 1.0294 - accuracy: 0.6776 - val_loss: 2.6812 - val_accuracy: 0.3721
Epoch 16/20
119/119 [==============================] - 5s 41ms/step - loss: 0.9249 - accuracy: 0.7063 - val_loss: 2.9345 - val_accuracy: 0.3689
Epoch 17/20
119/119 [==============================] - 5s 42ms/step - loss: 0.8156 - accuracy: 0.7398 - val_loss: 3.0814 - val_accuracy: 0.3615
Epoch 18/20
119/119 [==============================] - 5s 43ms/step - loss: 0.7208 - accuracy: 0.7713 - val_loss: 3.7819 - val_accuracy: 0.3319
Epoch 19/20
119/119 [==============================] - 5s 42ms/step - loss: 0.6296 - accuracy: 0.8017 - val_loss: 4.1202 - val_accuracy: 0.3214
Epoch 20/20
119/119 [==============================] - 5s 42ms/step - loss: 0.5418 - accuracy: 0.8250 - val_loss: 4.7613 - val_accuracy: 0.3129

Here you can compute the accuracy of your trained model with respect to a separate testing set.

In [36]:
network1.compute_accuracy(data_manager.X_test, data_manager.y_test)
15/15 [==============================] - 0s 13ms/step - loss: 4.8948 - accuracy: 0.3055
Out[36]:
0.3054968287526427

Below shows how you can inspect the training progress.

In [37]:
network1.plot_progress()

You can use the method predict() to predict labels for data examples in a test set.

In [38]:
network1.predict(data_manager.X_test[0:10])
1/1 [==============================] - 0s 103ms/step
Out[38]:
array([ 2, 11,  9, 18, 15,  9,  5,  0,  9, 10], dtype=int64)

Finally, the method plot_prediction() visualizes the predictions for a test set in which several images are chosen to show the predictions.

In [39]:
network1.plot_prediction(data_manager.X_test, data_manager.y_test, data_manager.classes)
30/30 [==============================] - 0s 7ms/step
<Figure size 640x480 with 0 Axes>

For questions 3.1 to 3.7, you'll need to write your own model in a way that makes it easy for you to experiment with different architectures and parameters. The goal is to be able to pass the parameters to initialize a new instance of YourModel to build different network architectures with different parameters. Below are descriptions of some parameters for YourModel:

  1. Block architecture: Each block has the pattern [conv, batch norm, activation, conv, batch norm, activation, mean pool]. All convolutional layers have filter size $(3, 3)$, strides $(1, 1)$ and 'SAME' padding, and all mean pool layers have strides $(2, 2)$ and 'SAME' padding. The network will consists of a few blocks before applying a global average pooling (GAP) layer to obtain vectors and then a dense layer to output the logits for the softmax layer.

When designing a block, there must have some instance variables as follows

  1. num_channels: the number of channels used in a block, which will be applied to two Convs in the block.

  2. mean_pool (True, False): the mean pool is used not. If mean_pool = True, it is used to downsample the input by two.

  3. batch_norm (True, False): the batch normalization function is used or not. Setting batch_norm to False means not using batch normalization.

  4. The skip connection (True, False) is added to the output of the second batch norm. Additionally, your class has a boolean property (i.e., instance variable) named use_skip. If use_skip=True, the skip connectnion is enable. Otherwise, if use_skip=False, the skip connectnion is disable.

Below is the architecture of one block:

Below is the architecture of the entire deep net with two blocks:

The above network has two blocks with the numbers of channels are 16 and 32 respectively. We apply a global average pooling (GAP) layer to flattern the output of the last block, followed by an output layer for prediction.

In [1]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models
from tensorflow.keras.layers import GlobalAveragePooling2D
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import EarlyStopping
In [2]:
tf.random.set_seed(1234)

**Question 3.1** Write the code of the YourModel class here. Note that this class will inherit from the DefaultModel class. You'll only need to re-write the code for the build_cnn method in the YourModel class from the cell below. Note that the YourModel class is inherited from the DefaultModel class.

[6 points]

Baseline Accuracy:

  • Training Accuracy: 66.75%
  • Validation Accuracy: 57.72%
  • Test Accuracy: 55.60%
In [42]:
# model class adapted from tutorials 
class YourModel(DefaultModel):
    def __init__(self,num_channels,blocks,mean_pool,batch_norm,use_skip,learning_rate,verbose,
                 name='network1',
                 width=32, height=32, depth=3,
                 num_classes=20, 
                 is_augmentation = False,
                 activation_func='relu',
                 optimizer='adam',
                 batch_size=128,
                 num_epochs= 30):
        super(YourModel, self).__init__(name, width, height, depth, num_classes, is_augmentation, 
                                        activation_func, optimizer, batch_size, num_epochs, 
                                        learning_rate, verbose)
        self.num_channels = num_channels
        self.mean_pool = mean_pool
        self.batch_norm = batch_norm
        self.use_skip = use_skip
        self.blocks = blocks
    
    def build_cnn(self,x):
        # Resblock code for builing each block
        # x1 takes the inputs of x and passes it through a conv layer
        x1 = layers.Conv2D(self.num_channels, (3, 3), strides=(1, 1), padding='same')(x)
        # if batch_norm is true, pass x1 through a batch norm layer
        if self.batch_norm:
            x1 = layers.BatchNormalization() (x1)
        # apply activation function to x1
        x1 = layers.Activation('relu') (x1)
        # pass values of x1 into a conv layer
        x2 = layers.Conv2D(self.num_channels, (3, 3), strides=(1, 1), padding='same')(x1)
        #if value of x1 is
        if self.batch_norm:
                x2 = layers.BatchNormalization()(x2)
        if x.shape != x2.shape:
            if x2.shape[3] > x.shape[3]:
                pad_tns = tf.constant([[0,0],[0, 0] ,[0,0], [x2.shape[3] - x.shape[3] , 0]])
                x = tf.pad(x, pad_tns, mode ='CONSTANT', constant_values=0)
            else:
                pad_tns = tf.constant([[0,0],[0, 0] ,[0,0], [x.shape[3] - x2.shape[3] , 0]])
                x = tf.pad(x2, pad_tns, mode ='CONSTANT', constant_values=0)
        x2_skip = layers.add([x, x2])
        x2_skip = layers.Activation('relu')(x2_skip)
        if self.mean_pool:
            output_layer = layers.AveragePooling2D(pool_size=(2, 2), padding='same')(x2_skip)
        else: 
            output_layer = x2_skip
        return output_layer
    
    def build_resnet(self):
        self.input_layer = layers.Input(shape=(self.width, self.height, self.depth))
        x = self.input_layer
        for i in range (self.blocks):
            x = self.build_cnn(x)
            self.num_channels = self.num_channels*2
            
        output_layer = GlobalAveragePooling2D()(x)
        output_layer = layers.Flatten()(output_layer)
        output_layer = layers.Dense(self.num_classes, activation='softmax')(output_layer)
        self.model = tf.keras.models.Model(inputs=self.input_layer, outputs=output_layer)
        self.model.compile(optimizer=self.optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
        
        

Now run your model with a specific configuration.

In [43]:
#Your run here
# num_channels,blocks,mean_pool,batch_norm,use_skip,learning_rate,verbose
testModel = YourModel(16,2,True,True,True,0.001,True)
testModel.build_resnet()
testModel.summary()
testModel.fit(data_manager, batch_size = 16, num_epochs = 20)
Model: "model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 input_1 (InputLayer)           [(None, 32, 32, 3)]  0           []                               
                                                                                                  
 conv2d_4 (Conv2D)              (None, 32, 32, 32)   896         ['input_1[0][0]']                
                                                                                                  
 batch_normalization (BatchNorm  (None, 32, 32, 32)  128         ['conv2d_4[0][0]']               
 alization)                                                                                       
                                                                                                  
 activation (Activation)        (None, 32, 32, 32)   0           ['batch_normalization[0][0]']    
                                                                                                  
 conv2d_5 (Conv2D)              (None, 32, 32, 32)   9248        ['activation[0][0]']             
                                                                                                  
 tf.compat.v1.pad (TFOpLambda)  (None, 32, 32, 32)   0           ['input_1[0][0]']                
                                                                                                  
 batch_normalization_1 (BatchNo  (None, 32, 32, 32)  128         ['conv2d_5[0][0]']               
 rmalization)                                                                                     
                                                                                                  
 add (Add)                      (None, 32, 32, 32)   0           ['tf.compat.v1.pad[0][0]',       
                                                                  'batch_normalization_1[0][0]']  
                                                                                                  
 activation_1 (Activation)      (None, 32, 32, 32)   0           ['add[0][0]']                    
                                                                                                  
 average_pooling2d_2 (AveragePo  (None, 16, 16, 32)  0           ['activation_1[0][0]']           
 oling2D)                                                                                         
                                                                                                  
 conv2d_6 (Conv2D)              (None, 16, 16, 64)   18496       ['average_pooling2d_2[0][0]']    
                                                                                                  
 batch_normalization_2 (BatchNo  (None, 16, 16, 64)  256         ['conv2d_6[0][0]']               
 rmalization)                                                                                     
                                                                                                  
 activation_2 (Activation)      (None, 16, 16, 64)   0           ['batch_normalization_2[0][0]']  
                                                                                                  
 conv2d_7 (Conv2D)              (None, 16, 16, 64)   36928       ['activation_2[0][0]']           
                                                                                                  
 tf.compat.v1.pad_1 (TFOpLambda  (None, 16, 16, 64)  0           ['average_pooling2d_2[0][0]']    
 )                                                                                                
                                                                                                  
 batch_normalization_3 (BatchNo  (None, 16, 16, 64)  256         ['conv2d_7[0][0]']               
 rmalization)                                                                                     
                                                                                                  
 add_1 (Add)                    (None, 16, 16, 64)   0           ['tf.compat.v1.pad_1[0][0]',     
                                                                  'batch_normalization_3[0][0]']  
                                                                                                  
 activation_3 (Activation)      (None, 16, 16, 64)   0           ['add_1[0][0]']                  
                                                                                                  
 average_pooling2d_3 (AveragePo  (None, 8, 8, 64)    0           ['activation_3[0][0]']           
 oling2D)                                                                                         
                                                                                                  
 global_average_pooling2d (Glob  (None, 64)          0           ['average_pooling2d_3[0][0]']    
 alAveragePooling2D)                                                                              
                                                                                                  
 flatten_1 (Flatten)            (None, 64)           0           ['global_average_pooling2d[0][0]'
                                                                 ]                                
                                                                                                  
 dense_4 (Dense)                (None, 20)           1300        ['flatten_1[0][0]']              
                                                                                                  
==================================================================================================
Total params: 67,636
Trainable params: 67,252
Non-trainable params: 384
__________________________________________________________________________________________________
None
Epoch 1/20
473/473 [==============================] - 9s 16ms/step - loss: 2.5454 - accuracy: 0.2274 - val_loss: 2.4879 - val_accuracy: 0.2357
Epoch 2/20
473/473 [==============================] - 7s 15ms/step - loss: 2.2405 - accuracy: 0.3165 - val_loss: 2.2155 - val_accuracy: 0.3087
Epoch 3/20
473/473 [==============================] - 7s 15ms/step - loss: 2.0808 - accuracy: 0.3583 - val_loss: 2.4101 - val_accuracy: 0.2558
Epoch 4/20
473/473 [==============================] - 7s 15ms/step - loss: 1.9518 - accuracy: 0.3942 - val_loss: 1.9236 - val_accuracy: 0.3932
Epoch 5/20
473/473 [==============================] - 7s 15ms/step - loss: 1.8321 - accuracy: 0.4283 - val_loss: 1.7975 - val_accuracy: 0.4197
Epoch 6/20
473/473 [==============================] - 7s 15ms/step - loss: 1.7244 - accuracy: 0.4655 - val_loss: 1.7212 - val_accuracy: 0.4641
Epoch 7/20
473/473 [==============================] - 7s 16ms/step - loss: 1.6426 - accuracy: 0.4907 - val_loss: 1.8694 - val_accuracy: 0.4123
Epoch 8/20
473/473 [==============================] - 8s 16ms/step - loss: 1.5542 - accuracy: 0.5144 - val_loss: 1.5336 - val_accuracy: 0.5402
Epoch 9/20
473/473 [==============================] - 8s 16ms/step - loss: 1.4966 - accuracy: 0.5282 - val_loss: 1.7946 - val_accuracy: 0.4450
Epoch 10/20
473/473 [==============================] - 7s 15ms/step - loss: 1.4419 - accuracy: 0.5516 - val_loss: 2.1555 - val_accuracy: 0.3552
Epoch 11/20
473/473 [==============================] - 7s 15ms/step - loss: 1.3794 - accuracy: 0.5698 - val_loss: 1.5656 - val_accuracy: 0.5116
Epoch 12/20
473/473 [==============================] - 7s 15ms/step - loss: 1.3423 - accuracy: 0.5742 - val_loss: 1.4894 - val_accuracy: 0.5349
Epoch 13/20
473/473 [==============================] - 7s 15ms/step - loss: 1.2964 - accuracy: 0.5955 - val_loss: 1.5869 - val_accuracy: 0.4979
Epoch 14/20
473/473 [==============================] - 7s 16ms/step - loss: 1.2537 - accuracy: 0.6050 - val_loss: 1.4339 - val_accuracy: 0.5507
Epoch 15/20
473/473 [==============================] - 7s 16ms/step - loss: 1.2226 - accuracy: 0.6147 - val_loss: 1.4221 - val_accuracy: 0.5412
Epoch 16/20
473/473 [==============================] - 7s 16ms/step - loss: 1.1923 - accuracy: 0.6282 - val_loss: 1.4886 - val_accuracy: 0.5370
Epoch 17/20
473/473 [==============================] - 7s 16ms/step - loss: 1.1622 - accuracy: 0.6349 - val_loss: 1.3364 - val_accuracy: 0.5920
Epoch 18/20
473/473 [==============================] - 8s 16ms/step - loss: 1.1086 - accuracy: 0.6495 - val_loss: 1.3345 - val_accuracy: 0.5867
Epoch 19/20
473/473 [==============================] - 8s 16ms/step - loss: 1.0759 - accuracy: 0.6632 - val_loss: 1.3187 - val_accuracy: 0.5888
Epoch 20/20
473/473 [==============================] - 8s 16ms/step - loss: 1.0528 - accuracy: 0.6675 - val_loss: 1.4775 - val_accuracy: 0.5772
In [44]:
testModel.compute_accuracy(data_manager.X_test, data_manager.y_test)
testModel.plot_progress()
testModel.plot_prediction(data_manager.X_test, data_manager.y_test, data_manager.classes)
15/15 [==============================] - 0s 17ms/step - loss: 1.5110 - accuracy: 0.5560
30/30 [==============================] - 0s 7ms/step
<Figure size 640x480 with 0 Axes>
In [45]:
#Save baseline model
testModel.model.save('models/base_dnn.h5')

**Question 3.2** Now, let us tune the number of blocks $num\_blocks \in \{3,4\}$, $use\_skip \in \{True, False\}$, $mean\_pool \in \{True, False\}$, and $learning\_rate \in \{0.001, 0.0001\}$. Write your code for this tuning and report the result of the best model on the testing set. Note that you need to show your code for tuning and evaluating on the test set to earn the full marks. During tuning, you can set the instance variable verbose of your model to False for not showing the training details of each epoch.

[4 points]

Report the best parameters and the testing accuracy here¶

Best Config:

Blocks Skip Pool Rate Accuracy
4 True True 0.001 58.77%

After testing all configurations, we can conclude that this is the best set of instance variables and will now be used for all subsequent questions

In [223]:
#Insert your code here. You can add more cells if necessary
num_blocks = [3,4]
use_skip = [True,False]
mean_pool = [True,False]
learning_rate = [0.001,0.0001]
best_accuracy = -float('inf')
bestModel = None

# num_channels,blocks,mean_pool,batch_norm,use_skip,learning_rate,verbose

for blocks in num_blocks:
    for skip in use_skip:
        for pool in mean_pool:
            for rate in learning_rate:
                testModel = YourModel(16,blocks,pool,True,skip,rate,True)
                testModel.build_resnet()
                testModel.fit(data_manager, batch_size = 64, num_epochs = 30)
                acc = testModel.compute_accuracy(data_manager.X_test, data_manager.y_test)
                if acc > best_accuracy:
                    best_model = testModel
                    best_accuracy = acc
                    best_config = (blocks,skip,pool,rate,acc)
                    
best_model.model.save('models/best_base_dnn_config.h5')
                
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16)
(None, 32, 32, 16)
(None, 16, 16, 16) (None, 16, 16, 32)
(None, 16, 16, 32)
(None, 8, 8, 32) (None, 8, 8, 64)
(None, 8, 8, 64)
Epoch 1/30
119/119 [==============================] - 13s 105ms/step - loss: 2.4923 - accuracy: 0.2440 - val_loss: 3.0471 - val_accuracy: 0.1025
Epoch 2/30
119/119 [==============================] - 12s 101ms/step - loss: 2.1277 - accuracy: 0.3467 - val_loss: 3.2528 - val_accuracy: 0.1279
Epoch 3/30
119/119 [==============================] - 12s 98ms/step - loss: 1.9461 - accuracy: 0.4070 - val_loss: 2.8097 - val_accuracy: 0.1977
Epoch 4/30
119/119 [==============================] - 12s 99ms/step - loss: 1.8313 - accuracy: 0.4411 - val_loss: 2.1317 - val_accuracy: 0.3414
Epoch 5/30
119/119 [==============================] - 12s 100ms/step - loss: 1.7085 - accuracy: 0.4763 - val_loss: 1.9802 - val_accuracy: 0.4038
Epoch 6/30
119/119 [==============================] - 12s 101ms/step - loss: 1.6147 - accuracy: 0.5029 - val_loss: 1.8196 - val_accuracy: 0.4366
Epoch 7/30
119/119 [==============================] - 12s 99ms/step - loss: 1.5251 - accuracy: 0.5347 - val_loss: 1.9529 - val_accuracy: 0.4186
Epoch 8/30
119/119 [==============================] - 12s 102ms/step - loss: 1.4721 - accuracy: 0.5446 - val_loss: 1.8760 - val_accuracy: 0.4207
Epoch 9/30
119/119 [==============================] - 12s 101ms/step - loss: 1.3922 - accuracy: 0.5731 - val_loss: 1.7467 - val_accuracy: 0.4609
Epoch 10/30
119/119 [==============================] - 12s 99ms/step - loss: 1.3194 - accuracy: 0.5939 - val_loss: 1.6478 - val_accuracy: 0.4799
Epoch 11/30
119/119 [==============================] - 13s 108ms/step - loss: 1.2437 - accuracy: 0.6148 - val_loss: 1.7037 - val_accuracy: 0.4820
Epoch 12/30
119/119 [==============================] - 12s 103ms/step - loss: 1.2083 - accuracy: 0.6239 - val_loss: 1.7971 - val_accuracy: 0.4609
Epoch 13/30
119/119 [==============================] - 13s 111ms/step - loss: 1.1413 - accuracy: 0.6431 - val_loss: 1.6978 - val_accuracy: 0.4736
Epoch 14/30
119/119 [==============================] - 12s 102ms/step - loss: 1.1088 - accuracy: 0.6541 - val_loss: 1.7140 - val_accuracy: 0.4704
Epoch 15/30
119/119 [==============================] - 12s 100ms/step - loss: 1.0597 - accuracy: 0.6724 - val_loss: 1.7007 - val_accuracy: 0.5053
Epoch 16/30
119/119 [==============================] - 12s 100ms/step - loss: 1.0065 - accuracy: 0.6817 - val_loss: 2.1691 - val_accuracy: 0.3827
Epoch 17/30
119/119 [==============================] - 12s 101ms/step - loss: 0.9765 - accuracy: 0.6948 - val_loss: 1.6123 - val_accuracy: 0.5275
Epoch 18/30
119/119 [==============================] - 12s 102ms/step - loss: 0.9267 - accuracy: 0.7132 - val_loss: 1.5606 - val_accuracy: 0.5254
Epoch 19/30
119/119 [==============================] - 12s 102ms/step - loss: 0.8760 - accuracy: 0.7284 - val_loss: 1.6391 - val_accuracy: 0.4884
Epoch 20/30
119/119 [==============================] - 12s 101ms/step - loss: 0.8644 - accuracy: 0.7283 - val_loss: 1.6272 - val_accuracy: 0.5328
Epoch 21/30
119/119 [==============================] - 12s 102ms/step - loss: 0.8107 - accuracy: 0.7475 - val_loss: 1.7318 - val_accuracy: 0.5106
Epoch 22/30
119/119 [==============================] - 12s 101ms/step - loss: 0.7856 - accuracy: 0.7574 - val_loss: 1.6059 - val_accuracy: 0.4968
Epoch 23/30
119/119 [==============================] - 12s 102ms/step - loss: 0.7273 - accuracy: 0.7800 - val_loss: 1.6060 - val_accuracy: 0.5391
Epoch 24/30
119/119 [==============================] - 12s 101ms/step - loss: 0.6959 - accuracy: 0.7876 - val_loss: 1.9895 - val_accuracy: 0.4683
Epoch 25/30
119/119 [==============================] - 12s 102ms/step - loss: 0.6657 - accuracy: 0.7954 - val_loss: 2.0533 - val_accuracy: 0.4545
Epoch 26/30
119/119 [==============================] - 12s 104ms/step - loss: 0.6310 - accuracy: 0.8049 - val_loss: 1.6096 - val_accuracy: 0.5307
Epoch 27/30
119/119 [==============================] - 12s 102ms/step - loss: 0.6205 - accuracy: 0.8063 - val_loss: 1.7208 - val_accuracy: 0.5021
Epoch 28/30
119/119 [==============================] - 13s 106ms/step - loss: 0.5879 - accuracy: 0.8222 - val_loss: 1.6549 - val_accuracy: 0.5359
Epoch 29/30
119/119 [==============================] - 15s 127ms/step - loss: 0.5777 - accuracy: 0.8200 - val_loss: 1.7144 - val_accuracy: 0.5539
Epoch 30/30
119/119 [==============================] - 12s 102ms/step - loss: 0.5184 - accuracy: 0.8430 - val_loss: 1.7804 - val_accuracy: 0.5518
15/15 [==============================] - 1s 35ms/step - loss: 1.6913 - accuracy: 0.5465
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16)
(None, 32, 32, 16)
(None, 16, 16, 16) (None, 16, 16, 32)
(None, 16, 16, 32)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 8, 8, 32) (None, 8, 8, 64)
(None, 8, 8, 64)
Epoch 1/30
119/119 [==============================] - 15s 124ms/step - loss: 2.9280 - accuracy: 0.1108 - val_loss: 2.9893 - val_accuracy: 0.0772
Epoch 2/30
119/119 [==============================] - 14s 114ms/step - loss: 2.6646 - accuracy: 0.2005 - val_loss: 2.9263 - val_accuracy: 0.0899
Epoch 3/30
119/119 [==============================] - 12s 103ms/step - loss: 2.5306 - accuracy: 0.2574 - val_loss: 2.6735 - val_accuracy: 0.1660
Epoch 4/30
119/119 [==============================] - 12s 101ms/step - loss: 2.4293 - accuracy: 0.2880 - val_loss: 2.4811 - val_accuracy: 0.2273
Epoch 5/30
119/119 [==============================] - 12s 100ms/step - loss: 2.3456 - accuracy: 0.3094 - val_loss: 2.3525 - val_accuracy: 0.2738
Epoch 6/30
119/119 [==============================] - 12s 100ms/step - loss: 2.2772 - accuracy: 0.3299 - val_loss: 2.2804 - val_accuracy: 0.2907
Epoch 7/30
119/119 [==============================] - 12s 100ms/step - loss: 2.2209 - accuracy: 0.3380 - val_loss: 2.2241 - val_accuracy: 0.3044
Epoch 8/30
119/119 [==============================] - 12s 102ms/step - loss: 2.1652 - accuracy: 0.3552 - val_loss: 2.2030 - val_accuracy: 0.3034
Epoch 9/30
119/119 [==============================] - 13s 108ms/step - loss: 2.1184 - accuracy: 0.3657 - val_loss: 2.1242 - val_accuracy: 0.3520
Epoch 10/30
119/119 [==============================] - 12s 103ms/step - loss: 2.0811 - accuracy: 0.3759 - val_loss: 2.1016 - val_accuracy: 0.3689
Epoch 11/30
119/119 [==============================] - 12s 102ms/step - loss: 2.0393 - accuracy: 0.3956 - val_loss: 2.0799 - val_accuracy: 0.3679
Epoch 12/30
119/119 [==============================] - 12s 103ms/step - loss: 2.0088 - accuracy: 0.3985 - val_loss: 2.0477 - val_accuracy: 0.3668
Epoch 13/30
119/119 [==============================] - 12s 102ms/step - loss: 1.9777 - accuracy: 0.4102 - val_loss: 2.0073 - val_accuracy: 0.3932
Epoch 14/30
119/119 [==============================] - 12s 104ms/step - loss: 1.9482 - accuracy: 0.4188 - val_loss: 2.0031 - val_accuracy: 0.3901
Epoch 15/30
119/119 [==============================] - 12s 102ms/step - loss: 1.9236 - accuracy: 0.4222 - val_loss: 1.9815 - val_accuracy: 0.3975
Epoch 16/30
119/119 [==============================] - 12s 102ms/step - loss: 1.8935 - accuracy: 0.4388 - val_loss: 1.9787 - val_accuracy: 0.4006
Epoch 17/30
119/119 [==============================] - 12s 101ms/step - loss: 1.8654 - accuracy: 0.4388 - val_loss: 1.9359 - val_accuracy: 0.3901
Epoch 18/30
119/119 [==============================] - 13s 108ms/step - loss: 1.8502 - accuracy: 0.4496 - val_loss: 1.9298 - val_accuracy: 0.4027
Epoch 19/30
119/119 [==============================] - 12s 105ms/step - loss: 1.8193 - accuracy: 0.4563 - val_loss: 1.9242 - val_accuracy: 0.4133
Epoch 20/30
119/119 [==============================] - 12s 102ms/step - loss: 1.7988 - accuracy: 0.4578 - val_loss: 1.9184 - val_accuracy: 0.4123
Epoch 21/30
119/119 [==============================] - 12s 100ms/step - loss: 1.7751 - accuracy: 0.4689 - val_loss: 1.8955 - val_accuracy: 0.4207
Epoch 22/30
119/119 [==============================] - 12s 100ms/step - loss: 1.7590 - accuracy: 0.4742 - val_loss: 1.9157 - val_accuracy: 0.4112
Epoch 23/30
119/119 [==============================] - 12s 102ms/step - loss: 1.7343 - accuracy: 0.4803 - val_loss: 1.8682 - val_accuracy: 0.4260
Epoch 24/30
119/119 [==============================] - 12s 103ms/step - loss: 1.7198 - accuracy: 0.4889 - val_loss: 1.8609 - val_accuracy: 0.4323
Epoch 25/30
119/119 [==============================] - 12s 101ms/step - loss: 1.6944 - accuracy: 0.4956 - val_loss: 1.8286 - val_accuracy: 0.4313
Epoch 26/30
119/119 [==============================] - 12s 100ms/step - loss: 1.6771 - accuracy: 0.4999 - val_loss: 1.8198 - val_accuracy: 0.4419
Epoch 27/30
119/119 [==============================] - 12s 102ms/step - loss: 1.6619 - accuracy: 0.4992 - val_loss: 1.8172 - val_accuracy: 0.4429
Epoch 28/30
119/119 [==============================] - 12s 101ms/step - loss: 1.6442 - accuracy: 0.5086 - val_loss: 1.8119 - val_accuracy: 0.4387
Epoch 29/30
119/119 [==============================] - 12s 102ms/step - loss: 1.6212 - accuracy: 0.5103 - val_loss: 1.7779 - val_accuracy: 0.4493
Epoch 30/30
119/119 [==============================] - 12s 99ms/step - loss: 1.6093 - accuracy: 0.5168 - val_loss: 1.8065 - val_accuracy: 0.4387
15/15 [==============================] - 1s 33ms/step - loss: 1.7299 - accuracy: 0.4704
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16)
(None, 32, 32, 16)
(None, 32, 32, 16) (None, 32, 32, 32)
(None, 32, 32, 32)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 32) (None, 32, 32, 64)
(None, 32, 32, 64)
Epoch 1/30
119/119 [==============================] - 45s 377ms/step - loss: 2.5993 - accuracy: 0.2206 - val_loss: 3.0598 - val_accuracy: 0.0835
Epoch 2/30
119/119 [==============================] - 45s 374ms/step - loss: 2.3132 - accuracy: 0.2948 - val_loss: 2.9004 - val_accuracy: 0.1332
Epoch 3/30
119/119 [==============================] - 31s 256ms/step - loss: 2.1634 - accuracy: 0.3406 - val_loss: 2.5185 - val_accuracy: 0.2082
Epoch 4/30
119/119 [==============================] - 29s 244ms/step - loss: 2.0424 - accuracy: 0.3853 - val_loss: 2.3367 - val_accuracy: 0.2865
Epoch 5/30
119/119 [==============================] - 29s 247ms/step - loss: 1.9609 - accuracy: 0.4034 - val_loss: 2.2986 - val_accuracy: 0.3214
Epoch 6/30
119/119 [==============================] - 31s 260ms/step - loss: 1.8778 - accuracy: 0.4247 - val_loss: 2.6598 - val_accuracy: 0.2569
Epoch 7/30
119/119 [==============================] - 32s 265ms/step - loss: 1.8133 - accuracy: 0.4439 - val_loss: 2.0327 - val_accuracy: 0.3658
Epoch 8/30
119/119 [==============================] - 33s 276ms/step - loss: 1.7591 - accuracy: 0.4550 - val_loss: 2.1733 - val_accuracy: 0.3478
Epoch 9/30
119/119 [==============================] - 30s 252ms/step - loss: 1.6747 - accuracy: 0.4794 - val_loss: 2.1670 - val_accuracy: 0.3573
Epoch 10/30
119/119 [==============================] - 30s 253ms/step - loss: 1.6190 - accuracy: 0.4995 - val_loss: 1.9513 - val_accuracy: 0.3964
Epoch 11/30
119/119 [==============================] - 30s 254ms/step - loss: 1.5573 - accuracy: 0.5228 - val_loss: 1.8114 - val_accuracy: 0.4461
Epoch 12/30
119/119 [==============================] - 30s 255ms/step - loss: 1.4940 - accuracy: 0.5333 - val_loss: 1.8698 - val_accuracy: 0.4313
Epoch 13/30
119/119 [==============================] - 30s 255ms/step - loss: 1.4651 - accuracy: 0.5507 - val_loss: 2.1673 - val_accuracy: 0.3552
Epoch 14/30
119/119 [==============================] - 31s 257ms/step - loss: 1.4159 - accuracy: 0.5569 - val_loss: 1.9580 - val_accuracy: 0.4260
Epoch 15/30
119/119 [==============================] - 31s 261ms/step - loss: 1.3878 - accuracy: 0.5684 - val_loss: 1.7248 - val_accuracy: 0.4630
Epoch 16/30
119/119 [==============================] - 31s 257ms/step - loss: 1.3594 - accuracy: 0.5738 - val_loss: 2.0013 - val_accuracy: 0.3858
Epoch 17/30
119/119 [==============================] - 31s 260ms/step - loss: 1.3107 - accuracy: 0.5922 - val_loss: 1.7925 - val_accuracy: 0.4567
Epoch 18/30
119/119 [==============================] - 31s 260ms/step - loss: 1.2900 - accuracy: 0.5950 - val_loss: 1.7052 - val_accuracy: 0.4778
Epoch 19/30
119/119 [==============================] - 31s 257ms/step - loss: 1.2597 - accuracy: 0.6093 - val_loss: 1.7073 - val_accuracy: 0.4609
Epoch 20/30
119/119 [==============================] - 31s 258ms/step - loss: 1.2434 - accuracy: 0.6165 - val_loss: 1.8809 - val_accuracy: 0.4588
Epoch 21/30
119/119 [==============================] - 31s 257ms/step - loss: 1.1844 - accuracy: 0.6290 - val_loss: 1.9237 - val_accuracy: 0.4662
Epoch 22/30
119/119 [==============================] - 31s 259ms/step - loss: 1.1665 - accuracy: 0.6351 - val_loss: 1.5667 - val_accuracy: 0.5021
Epoch 23/30
119/119 [==============================] - 31s 258ms/step - loss: 1.1333 - accuracy: 0.6464 - val_loss: 1.7319 - val_accuracy: 0.4884
Epoch 24/30
119/119 [==============================] - 31s 259ms/step - loss: 1.1148 - accuracy: 0.6521 - val_loss: 1.7901 - val_accuracy: 0.4725
Epoch 25/30
119/119 [==============================] - 32s 271ms/step - loss: 1.0952 - accuracy: 0.6693 - val_loss: 1.9127 - val_accuracy: 0.4355
Epoch 26/30
119/119 [==============================] - 33s 276ms/step - loss: 1.0709 - accuracy: 0.6661 - val_loss: 1.9725 - val_accuracy: 0.4313
Epoch 27/30
119/119 [==============================] - 31s 261ms/step - loss: 1.0568 - accuracy: 0.6758 - val_loss: 1.6910 - val_accuracy: 0.5137
Epoch 28/30
119/119 [==============================] - 31s 259ms/step - loss: 1.0248 - accuracy: 0.6841 - val_loss: 1.8256 - val_accuracy: 0.4789
Epoch 29/30
119/119 [==============================] - 31s 259ms/step - loss: 1.0102 - accuracy: 0.6840 - val_loss: 1.9483 - val_accuracy: 0.4535
Epoch 30/30
119/119 [==============================] - 31s 260ms/step - loss: 0.9965 - accuracy: 0.6919 - val_loss: 1.6930 - val_accuracy: 0.5095
15/15 [==============================] - 1s 76ms/step - loss: 1.5292 - accuracy: 0.5645
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16)
(None, 32, 32, 16)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 16) (None, 32, 32, 32)
(None, 32, 32, 32)
(None, 32, 32, 32) (None, 32, 32, 64)
(None, 32, 32, 64)
Epoch 1/30
119/119 [==============================] - 31s 258ms/step - loss: 3.0070 - accuracy: 0.1263 - val_loss: 2.9836 - val_accuracy: 0.0655
Epoch 2/30
119/119 [==============================] - 31s 260ms/step - loss: 2.7050 - accuracy: 0.2094 - val_loss: 2.8827 - val_accuracy: 0.1057
Epoch 3/30
119/119 [==============================] - 31s 258ms/step - loss: 2.6049 - accuracy: 0.2329 - val_loss: 2.6769 - val_accuracy: 0.1723
Epoch 4/30
119/119 [==============================] - 31s 260ms/step - loss: 2.5347 - accuracy: 0.2556 - val_loss: 2.5364 - val_accuracy: 0.2336
Epoch 5/30
119/119 [==============================] - 31s 257ms/step - loss: 2.4813 - accuracy: 0.2680 - val_loss: 2.4768 - val_accuracy: 0.2611
Epoch 6/30
119/119 [==============================] - 31s 259ms/step - loss: 2.4375 - accuracy: 0.2817 - val_loss: 2.4490 - val_accuracy: 0.2791
Epoch 7/30
119/119 [==============================] - 31s 258ms/step - loss: 2.4003 - accuracy: 0.2901 - val_loss: 2.3748 - val_accuracy: 0.2960
Epoch 8/30
119/119 [==============================] - 31s 264ms/step - loss: 2.3609 - accuracy: 0.3024 - val_loss: 2.3632 - val_accuracy: 0.2939
Epoch 9/30
119/119 [==============================] - 31s 259ms/step - loss: 2.3312 - accuracy: 0.3104 - val_loss: 2.3478 - val_accuracy: 0.2896
Epoch 10/30
119/119 [==============================] - 31s 259ms/step - loss: 2.3063 - accuracy: 0.3188 - val_loss: 2.3062 - val_accuracy: 0.2981
Epoch 11/30
119/119 [==============================] - 31s 259ms/step - loss: 2.2751 - accuracy: 0.3300 - val_loss: 2.2863 - val_accuracy: 0.3266
Epoch 12/30
119/119 [==============================] - 31s 259ms/step - loss: 2.2487 - accuracy: 0.3312 - val_loss: 2.2497 - val_accuracy: 0.3214
Epoch 13/30
119/119 [==============================] - 31s 260ms/step - loss: 2.2286 - accuracy: 0.3397 - val_loss: 2.2308 - val_accuracy: 0.3192
Epoch 14/30
119/119 [==============================] - 31s 262ms/step - loss: 2.2003 - accuracy: 0.3454 - val_loss: 2.1979 - val_accuracy: 0.3414
Epoch 15/30
119/119 [==============================] - 31s 259ms/step - loss: 2.1801 - accuracy: 0.3475 - val_loss: 2.2015 - val_accuracy: 0.3414
Epoch 16/30
119/119 [==============================] - 31s 259ms/step - loss: 2.1575 - accuracy: 0.3585 - val_loss: 2.1855 - val_accuracy: 0.3541
Epoch 17/30
119/119 [==============================] - 31s 260ms/step - loss: 2.1363 - accuracy: 0.3672 - val_loss: 2.1712 - val_accuracy: 0.3372
Epoch 18/30
119/119 [==============================] - 31s 259ms/step - loss: 2.1228 - accuracy: 0.3754 - val_loss: 2.1378 - val_accuracy: 0.3626
Epoch 19/30
119/119 [==============================] - 31s 260ms/step - loss: 2.0974 - accuracy: 0.3763 - val_loss: 2.1328 - val_accuracy: 0.3827
Epoch 20/30
119/119 [==============================] - 31s 263ms/step - loss: 2.0794 - accuracy: 0.3798 - val_loss: 2.1067 - val_accuracy: 0.3763
Epoch 21/30
119/119 [==============================] - 32s 265ms/step - loss: 2.0613 - accuracy: 0.3870 - val_loss: 2.0842 - val_accuracy: 0.3890
Epoch 22/30
119/119 [==============================] - 31s 260ms/step - loss: 2.0448 - accuracy: 0.3910 - val_loss: 2.0888 - val_accuracy: 0.3816
Epoch 23/30
119/119 [==============================] - 31s 264ms/step - loss: 2.0293 - accuracy: 0.3967 - val_loss: 2.0664 - val_accuracy: 0.3869
Epoch 24/30
119/119 [==============================] - 31s 259ms/step - loss: 2.0159 - accuracy: 0.3988 - val_loss: 2.0503 - val_accuracy: 0.4080
Epoch 25/30
119/119 [==============================] - 31s 264ms/step - loss: 1.9938 - accuracy: 0.4040 - val_loss: 2.0377 - val_accuracy: 0.4080
Epoch 26/30
119/119 [==============================] - 31s 260ms/step - loss: 1.9779 - accuracy: 0.4122 - val_loss: 2.0320 - val_accuracy: 0.4059
Epoch 27/30
119/119 [==============================] - 31s 259ms/step - loss: 1.9679 - accuracy: 0.4085 - val_loss: 2.0045 - val_accuracy: 0.4260
Epoch 28/30
119/119 [==============================] - 31s 258ms/step - loss: 1.9531 - accuracy: 0.4188 - val_loss: 1.9911 - val_accuracy: 0.4027
Epoch 29/30
119/119 [==============================] - 31s 260ms/step - loss: 1.9373 - accuracy: 0.4138 - val_loss: 1.9827 - val_accuracy: 0.4313
Epoch 30/30
119/119 [==============================] - 31s 258ms/step - loss: 1.9282 - accuracy: 0.4214 - val_loss: 1.9677 - val_accuracy: 0.4281
15/15 [==============================] - 1s 74ms/step - loss: 1.9307 - accuracy: 0.4207
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16)
(None, 32, 32, 16)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 16, 16, 16) (None, 16, 16, 32)
(None, 16, 16, 32)
(None, 8, 8, 32) (None, 8, 8, 64)
(None, 8, 8, 64)
Epoch 1/30
119/119 [==============================] - 9s 73ms/step - loss: 2.5564 - accuracy: 0.2312 - val_loss: 3.0646 - val_accuracy: 0.1004
Epoch 2/30
119/119 [==============================] - 9s 72ms/step - loss: 2.1750 - accuracy: 0.3421 - val_loss: 3.2105 - val_accuracy: 0.1628
Epoch 3/30
119/119 [==============================] - 9s 74ms/step - loss: 1.9692 - accuracy: 0.3984 - val_loss: 2.6672 - val_accuracy: 0.2040
Epoch 4/30
119/119 [==============================] - 8s 71ms/step - loss: 1.8273 - accuracy: 0.4467 - val_loss: 2.2065 - val_accuracy: 0.3362
Epoch 5/30
119/119 [==============================] - 8s 70ms/step - loss: 1.7050 - accuracy: 0.4803 - val_loss: 2.5577 - val_accuracy: 0.2907
Epoch 6/30
119/119 [==============================] - 8s 70ms/step - loss: 1.6039 - accuracy: 0.5124 - val_loss: 1.8100 - val_accuracy: 0.4355
Epoch 7/30
119/119 [==============================] - 9s 72ms/step - loss: 1.5307 - accuracy: 0.5292 - val_loss: 1.7782 - val_accuracy: 0.4577
Epoch 8/30
119/119 [==============================] - 8s 71ms/step - loss: 1.4548 - accuracy: 0.5562 - val_loss: 1.7533 - val_accuracy: 0.4471
Epoch 9/30
119/119 [==============================] - 9s 71ms/step - loss: 1.3723 - accuracy: 0.5746 - val_loss: 1.8553 - val_accuracy: 0.4567
Epoch 10/30
119/119 [==============================] - 8s 71ms/step - loss: 1.2927 - accuracy: 0.6041 - val_loss: 1.8957 - val_accuracy: 0.4165
Epoch 11/30
119/119 [==============================] - 8s 71ms/step - loss: 1.2390 - accuracy: 0.6157 - val_loss: 1.8694 - val_accuracy: 0.4419
Epoch 12/30
119/119 [==============================] - 8s 71ms/step - loss: 1.1868 - accuracy: 0.6366 - val_loss: 1.6508 - val_accuracy: 0.5011
Epoch 13/30
119/119 [==============================] - 8s 71ms/step - loss: 1.1314 - accuracy: 0.6519 - val_loss: 1.9929 - val_accuracy: 0.4281
Epoch 14/30
119/119 [==============================] - 9s 72ms/step - loss: 1.1002 - accuracy: 0.6636 - val_loss: 1.7336 - val_accuracy: 0.4894
Epoch 15/30
119/119 [==============================] - 9s 71ms/step - loss: 1.0398 - accuracy: 0.6784 - val_loss: 1.6048 - val_accuracy: 0.5148
Epoch 16/30
119/119 [==============================] - 8s 70ms/step - loss: 0.9964 - accuracy: 0.6911 - val_loss: 2.0252 - val_accuracy: 0.4070
Epoch 17/30
119/119 [==============================] - 8s 70ms/step - loss: 0.9685 - accuracy: 0.6989 - val_loss: 1.8448 - val_accuracy: 0.4746
Epoch 18/30
119/119 [==============================] - 8s 71ms/step - loss: 0.9244 - accuracy: 0.7114 - val_loss: 1.5801 - val_accuracy: 0.5127
Epoch 19/30
119/119 [==============================] - 8s 69ms/step - loss: 0.8682 - accuracy: 0.7286 - val_loss: 1.9696 - val_accuracy: 0.4514
Epoch 20/30
119/119 [==============================] - 8s 71ms/step - loss: 0.8400 - accuracy: 0.7388 - val_loss: 1.7320 - val_accuracy: 0.4958
Epoch 21/30
119/119 [==============================] - 8s 71ms/step - loss: 0.7814 - accuracy: 0.7618 - val_loss: 2.0780 - val_accuracy: 0.4757
Epoch 22/30
119/119 [==============================] - 8s 71ms/step - loss: 0.7626 - accuracy: 0.7601 - val_loss: 1.9845 - val_accuracy: 0.4672
Epoch 23/30
119/119 [==============================] - 8s 71ms/step - loss: 0.7273 - accuracy: 0.7739 - val_loss: 1.9220 - val_accuracy: 0.4989
Epoch 24/30
119/119 [==============================] - 9s 72ms/step - loss: 0.6824 - accuracy: 0.7897 - val_loss: 2.1136 - val_accuracy: 0.4429
Epoch 25/30
119/119 [==============================] - 8s 70ms/step - loss: 0.6467 - accuracy: 0.7993 - val_loss: 1.9164 - val_accuracy: 0.4545
Epoch 26/30
119/119 [==============================] - 9s 71ms/step - loss: 0.6146 - accuracy: 0.8127 - val_loss: 1.7934 - val_accuracy: 0.5063
Epoch 27/30
119/119 [==============================] - 8s 70ms/step - loss: 0.5848 - accuracy: 0.8218 - val_loss: 1.5915 - val_accuracy: 0.5307
Epoch 28/30
119/119 [==============================] - 9s 72ms/step - loss: 0.5579 - accuracy: 0.8304 - val_loss: 2.0371 - val_accuracy: 0.4757
Epoch 29/30
119/119 [==============================] - 8s 70ms/step - loss: 0.5395 - accuracy: 0.8385 - val_loss: 1.7424 - val_accuracy: 0.5127
Epoch 30/30
119/119 [==============================] - 8s 71ms/step - loss: 0.4997 - accuracy: 0.8483 - val_loss: 1.6464 - val_accuracy: 0.5550
15/15 [==============================] - 0s 23ms/step - loss: 1.5557 - accuracy: 0.5867
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16)
(None, 32, 32, 16)
(None, 16, 16, 16) (None, 16, 16, 32)
(None, 16, 16, 32)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 8, 8, 32) (None, 8, 8, 64)
(None, 8, 8, 64)
Epoch 1/30
119/119 [==============================] - 8s 71ms/step - loss: 2.9814 - accuracy: 0.1140 - val_loss: 2.9667 - val_accuracy: 0.0835
Epoch 2/30
119/119 [==============================] - 8s 70ms/step - loss: 2.6891 - accuracy: 0.1938 - val_loss: 2.8303 - val_accuracy: 0.1427
Epoch 3/30
119/119 [==============================] - 8s 71ms/step - loss: 2.5423 - accuracy: 0.2521 - val_loss: 2.5894 - val_accuracy: 0.2156
Epoch 4/30
119/119 [==============================] - 8s 70ms/step - loss: 2.4431 - accuracy: 0.2844 - val_loss: 2.4466 - val_accuracy: 0.2632
Epoch 5/30
119/119 [==============================] - 9s 72ms/step - loss: 2.3616 - accuracy: 0.3021 - val_loss: 2.3605 - val_accuracy: 0.2981
Epoch 6/30
119/119 [==============================] - 8s 69ms/step - loss: 2.2964 - accuracy: 0.3249 - val_loss: 2.3237 - val_accuracy: 0.2949
Epoch 7/30
119/119 [==============================] - 9s 72ms/step - loss: 2.2429 - accuracy: 0.3300 - val_loss: 2.2628 - val_accuracy: 0.3161
Epoch 8/30
119/119 [==============================] - 8s 71ms/step - loss: 2.1861 - accuracy: 0.3504 - val_loss: 2.2352 - val_accuracy: 0.3288
Epoch 9/30
119/119 [==============================] - 8s 71ms/step - loss: 2.1428 - accuracy: 0.3598 - val_loss: 2.2067 - val_accuracy: 0.3277
Epoch 10/30
119/119 [==============================] - 8s 70ms/step - loss: 2.1051 - accuracy: 0.3685 - val_loss: 2.1687 - val_accuracy: 0.3383
Epoch 11/30
119/119 [==============================] - 8s 71ms/step - loss: 2.0656 - accuracy: 0.3798 - val_loss: 2.1352 - val_accuracy: 0.3541
Epoch 12/30
119/119 [==============================] - 9s 73ms/step - loss: 2.0328 - accuracy: 0.3882 - val_loss: 2.1101 - val_accuracy: 0.3510
Epoch 13/30
119/119 [==============================] - 8s 71ms/step - loss: 2.0032 - accuracy: 0.3987 - val_loss: 2.0832 - val_accuracy: 0.3700
Epoch 14/30
119/119 [==============================] - 8s 70ms/step - loss: 1.9717 - accuracy: 0.4060 - val_loss: 2.0381 - val_accuracy: 0.3763
Epoch 15/30
119/119 [==============================] - 9s 71ms/step - loss: 1.9439 - accuracy: 0.4135 - val_loss: 2.0286 - val_accuracy: 0.3922
Epoch 16/30
119/119 [==============================] - 8s 71ms/step - loss: 1.9127 - accuracy: 0.4255 - val_loss: 2.0308 - val_accuracy: 0.3890
Epoch 17/30
119/119 [==============================] - 8s 71ms/step - loss: 1.8875 - accuracy: 0.4279 - val_loss: 1.9925 - val_accuracy: 0.3901
Epoch 18/30
119/119 [==============================] - 8s 71ms/step - loss: 1.8678 - accuracy: 0.4386 - val_loss: 1.9628 - val_accuracy: 0.4070
Epoch 19/30
119/119 [==============================] - 8s 71ms/step - loss: 1.8360 - accuracy: 0.4491 - val_loss: 1.9794 - val_accuracy: 0.4027
Epoch 20/30
119/119 [==============================] - 8s 71ms/step - loss: 1.8166 - accuracy: 0.4552 - val_loss: 1.9391 - val_accuracy: 0.4144
Epoch 21/30
119/119 [==============================] - 8s 71ms/step - loss: 1.7931 - accuracy: 0.4636 - val_loss: 1.9311 - val_accuracy: 0.4017
Epoch 22/30
119/119 [==============================] - 8s 71ms/step - loss: 1.7764 - accuracy: 0.4684 - val_loss: 1.9246 - val_accuracy: 0.4154
Epoch 23/30
119/119 [==============================] - 8s 71ms/step - loss: 1.7477 - accuracy: 0.4742 - val_loss: 1.9544 - val_accuracy: 0.4165
Epoch 24/30
119/119 [==============================] - 8s 71ms/step - loss: 1.7375 - accuracy: 0.4741 - val_loss: 1.9060 - val_accuracy: 0.4154
Epoch 25/30
119/119 [==============================] - 8s 70ms/step - loss: 1.7108 - accuracy: 0.4839 - val_loss: 1.8796 - val_accuracy: 0.4271
Epoch 26/30
119/119 [==============================] - 9s 72ms/step - loss: 1.6926 - accuracy: 0.4896 - val_loss: 1.8658 - val_accuracy: 0.4376
Epoch 27/30
119/119 [==============================] - 8s 70ms/step - loss: 1.6742 - accuracy: 0.4938 - val_loss: 1.8440 - val_accuracy: 0.4249
Epoch 28/30
119/119 [==============================] - 9s 72ms/step - loss: 1.6631 - accuracy: 0.5016 - val_loss: 1.8284 - val_accuracy: 0.4355
Epoch 29/30
119/119 [==============================] - 8s 71ms/step - loss: 1.6380 - accuracy: 0.5108 - val_loss: 1.8169 - val_accuracy: 0.4450
Epoch 30/30
119/119 [==============================] - 9s 72ms/step - loss: 1.6253 - accuracy: 0.5134 - val_loss: 1.8194 - val_accuracy: 0.4514
15/15 [==============================] - 0s 22ms/step - loss: 1.7488 - accuracy: 0.4736
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16)
(None, 32, 32, 16)
(None, 32, 32, 16) (None, 32, 32, 32)
(None, 32, 32, 32)
(None, 32, 32, 32) (None, 32, 32, 64)
(None, 32, 32, 64)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
Epoch 1/30
119/119 [==============================] - 31s 257ms/step - loss: 2.6296 - accuracy: 0.2056 - val_loss: 2.8556 - val_accuracy: 0.0983
Epoch 2/30
119/119 [==============================] - 31s 259ms/step - loss: 2.3787 - accuracy: 0.2836 - val_loss: 2.6694 - val_accuracy: 0.1638
Epoch 3/30
119/119 [==============================] - 31s 258ms/step - loss: 2.2439 - accuracy: 0.3136 - val_loss: 2.3958 - val_accuracy: 0.2326
Epoch 4/30
119/119 [==============================] - 31s 259ms/step - loss: 2.1156 - accuracy: 0.3521 - val_loss: 2.1679 - val_accuracy: 0.3224
Epoch 5/30
119/119 [==============================] - 31s 257ms/step - loss: 2.0314 - accuracy: 0.3769 - val_loss: 2.1703 - val_accuracy: 0.3414
Epoch 6/30
119/119 [==============================] - 32s 265ms/step - loss: 1.9413 - accuracy: 0.4012 - val_loss: 2.3300 - val_accuracy: 0.2822
Epoch 7/30
119/119 [==============================] - 31s 260ms/step - loss: 1.8751 - accuracy: 0.4272 - val_loss: 2.0460 - val_accuracy: 0.3647
Epoch 8/30
119/119 [==============================] - 31s 261ms/step - loss: 1.8073 - accuracy: 0.4460 - val_loss: 2.1693 - val_accuracy: 0.3319
Epoch 9/30
119/119 [==============================] - 31s 259ms/step - loss: 1.7244 - accuracy: 0.4665 - val_loss: 2.0222 - val_accuracy: 0.3689
Epoch 10/30
119/119 [==============================] - 31s 261ms/step - loss: 1.6705 - accuracy: 0.4806 - val_loss: 1.9322 - val_accuracy: 0.3953
Epoch 11/30
119/119 [==============================] - 31s 260ms/step - loss: 1.6042 - accuracy: 0.5037 - val_loss: 2.4594 - val_accuracy: 0.3309
Epoch 12/30
119/119 [==============================] - 31s 261ms/step - loss: 1.5649 - accuracy: 0.5144 - val_loss: 1.9625 - val_accuracy: 0.3869
Epoch 13/30
119/119 [==============================] - 31s 261ms/step - loss: 1.5048 - accuracy: 0.5386 - val_loss: 2.0244 - val_accuracy: 0.4080
Epoch 14/30
119/119 [==============================] - 31s 262ms/step - loss: 1.4788 - accuracy: 0.5362 - val_loss: 1.8352 - val_accuracy: 0.4397
Epoch 15/30
119/119 [==============================] - 31s 261ms/step - loss: 1.4406 - accuracy: 0.5603 - val_loss: 1.8192 - val_accuracy: 0.4408
Epoch 16/30
119/119 [==============================] - 31s 261ms/step - loss: 1.4021 - accuracy: 0.5622 - val_loss: 2.0564 - val_accuracy: 0.3795
Epoch 17/30
119/119 [==============================] - 32s 265ms/step - loss: 1.3629 - accuracy: 0.5792 - val_loss: 1.6949 - val_accuracy: 0.4355
Epoch 18/30
119/119 [==============================] - 31s 263ms/step - loss: 1.3345 - accuracy: 0.5862 - val_loss: 1.6340 - val_accuracy: 0.4662
Epoch 19/30
119/119 [==============================] - 31s 262ms/step - loss: 1.2925 - accuracy: 0.6022 - val_loss: 1.7518 - val_accuracy: 0.4609
Epoch 20/30
119/119 [==============================] - 31s 261ms/step - loss: 1.2891 - accuracy: 0.6021 - val_loss: 1.5895 - val_accuracy: 0.5085
Epoch 21/30
119/119 [==============================] - 31s 262ms/step - loss: 1.2330 - accuracy: 0.6177 - val_loss: 1.6727 - val_accuracy: 0.4810
Epoch 22/30
119/119 [==============================] - 31s 260ms/step - loss: 1.2132 - accuracy: 0.6201 - val_loss: 1.7371 - val_accuracy: 0.4757
Epoch 23/30
119/119 [==============================] - 31s 261ms/step - loss: 1.1826 - accuracy: 0.6384 - val_loss: 1.6990 - val_accuracy: 0.4556
Epoch 24/30
119/119 [==============================] - 31s 260ms/step - loss: 1.1520 - accuracy: 0.6478 - val_loss: 1.7441 - val_accuracy: 0.4725
Epoch 25/30
119/119 [==============================] - 31s 265ms/step - loss: 1.1235 - accuracy: 0.6567 - val_loss: 1.7342 - val_accuracy: 0.4958
Epoch 26/30
119/119 [==============================] - 31s 262ms/step - loss: 1.1014 - accuracy: 0.6574 - val_loss: 2.0277 - val_accuracy: 0.4524
Epoch 27/30
119/119 [==============================] - 31s 262ms/step - loss: 1.0814 - accuracy: 0.6660 - val_loss: 1.6935 - val_accuracy: 0.4778
Epoch 28/30
119/119 [==============================] - 32s 267ms/step - loss: 1.0592 - accuracy: 0.6749 - val_loss: 1.8152 - val_accuracy: 0.4471
Epoch 29/30
119/119 [==============================] - 31s 261ms/step - loss: 1.0396 - accuracy: 0.6762 - val_loss: 1.6763 - val_accuracy: 0.5349
Epoch 30/30
119/119 [==============================] - 31s 261ms/step - loss: 1.0158 - accuracy: 0.6832 - val_loss: 2.1599 - val_accuracy: 0.3996
15/15 [==============================] - 1s 74ms/step - loss: 1.9348 - accuracy: 0.4419
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16)
(None, 32, 32, 16)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 16) (None, 32, 32, 32)
(None, 32, 32, 32)
(None, 32, 32, 32) (None, 32, 32, 64)
(None, 32, 32, 64)
Epoch 1/30
119/119 [==============================] - 31s 262ms/step - loss: 2.8817 - accuracy: 0.1224 - val_loss: 2.9331 - val_accuracy: 0.1068
Epoch 2/30
119/119 [==============================] - 31s 261ms/step - loss: 2.6501 - accuracy: 0.2049 - val_loss: 2.7761 - val_accuracy: 0.1438
Epoch 3/30
119/119 [==============================] - 31s 264ms/step - loss: 2.5590 - accuracy: 0.2479 - val_loss: 2.5903 - val_accuracy: 0.1956
Epoch 4/30
119/119 [==============================] - 32s 267ms/step - loss: 2.4942 - accuracy: 0.2684 - val_loss: 2.4825 - val_accuracy: 0.2526
Epoch 5/30
119/119 [==============================] - 32s 264ms/step - loss: 2.4435 - accuracy: 0.2820 - val_loss: 2.4502 - val_accuracy: 0.2653
Epoch 6/30
119/119 [==============================] - 31s 264ms/step - loss: 2.4077 - accuracy: 0.2878 - val_loss: 2.4252 - val_accuracy: 0.2611
Epoch 7/30
119/119 [==============================] - 31s 262ms/step - loss: 2.3718 - accuracy: 0.3007 - val_loss: 2.3738 - val_accuracy: 0.2886
Epoch 8/30
119/119 [==============================] - 31s 263ms/step - loss: 2.3360 - accuracy: 0.3116 - val_loss: 2.3470 - val_accuracy: 0.2896
Epoch 9/30
119/119 [==============================] - 31s 261ms/step - loss: 2.3061 - accuracy: 0.3169 - val_loss: 2.3270 - val_accuracy: 0.2822
Epoch 10/30
119/119 [==============================] - 32s 269ms/step - loss: 2.2768 - accuracy: 0.3267 - val_loss: 2.2896 - val_accuracy: 0.3087
Epoch 11/30
119/119 [==============================] - 31s 261ms/step - loss: 2.2433 - accuracy: 0.3423 - val_loss: 2.2748 - val_accuracy: 0.3076
Epoch 12/30
119/119 [==============================] - 31s 264ms/step - loss: 2.2172 - accuracy: 0.3407 - val_loss: 2.2461 - val_accuracy: 0.3414
Epoch 13/30
119/119 [==============================] - 31s 265ms/step - loss: 2.1953 - accuracy: 0.3491 - val_loss: 2.2236 - val_accuracy: 0.3192
Epoch 14/30
119/119 [==============================] - 32s 266ms/step - loss: 2.1661 - accuracy: 0.3560 - val_loss: 2.1939 - val_accuracy: 0.3319
Epoch 15/30
119/119 [==============================] - 31s 261ms/step - loss: 2.1459 - accuracy: 0.3628 - val_loss: 2.1866 - val_accuracy: 0.3446
Epoch 16/30
119/119 [==============================] - 31s 262ms/step - loss: 2.1231 - accuracy: 0.3716 - val_loss: 2.1683 - val_accuracy: 0.3636
Epoch 17/30
119/119 [==============================] - 31s 262ms/step - loss: 2.1019 - accuracy: 0.3709 - val_loss: 2.1496 - val_accuracy: 0.3446
Epoch 18/30
119/119 [==============================] - 31s 262ms/step - loss: 2.0892 - accuracy: 0.3807 - val_loss: 2.1291 - val_accuracy: 0.3700
Epoch 19/30
119/119 [==============================] - 31s 262ms/step - loss: 2.0625 - accuracy: 0.3872 - val_loss: 2.1288 - val_accuracy: 0.3732
Epoch 20/30
119/119 [==============================] - 31s 262ms/step - loss: 2.0472 - accuracy: 0.3956 - val_loss: 2.0893 - val_accuracy: 0.3816
Epoch 21/30
119/119 [==============================] - 32s 266ms/step - loss: 2.0298 - accuracy: 0.3978 - val_loss: 2.0850 - val_accuracy: 0.3816
Epoch 22/30
119/119 [==============================] - 31s 264ms/step - loss: 2.0139 - accuracy: 0.3992 - val_loss: 2.0716 - val_accuracy: 0.3953
Epoch 23/30
119/119 [==============================] - 31s 264ms/step - loss: 1.9966 - accuracy: 0.4054 - val_loss: 2.0595 - val_accuracy: 0.3901
Epoch 24/30
119/119 [==============================] - 36s 305ms/step - loss: 1.9844 - accuracy: 0.4075 - val_loss: 2.0470 - val_accuracy: 0.3943
Epoch 25/30
119/119 [==============================] - 32s 268ms/step - loss: 1.9644 - accuracy: 0.4122 - val_loss: 2.0328 - val_accuracy: 0.4017
Epoch 26/30
119/119 [==============================] - 31s 264ms/step - loss: 1.9518 - accuracy: 0.4151 - val_loss: 2.0436 - val_accuracy: 0.3879
Epoch 27/30
119/119 [==============================] - 31s 262ms/step - loss: 1.9390 - accuracy: 0.4257 - val_loss: 2.0132 - val_accuracy: 0.3922
Epoch 28/30
119/119 [==============================] - 31s 261ms/step - loss: 1.9265 - accuracy: 0.4238 - val_loss: 1.9929 - val_accuracy: 0.4154
Epoch 29/30
119/119 [==============================] - 32s 266ms/step - loss: 1.9097 - accuracy: 0.4321 - val_loss: 1.9731 - val_accuracy: 0.4101
Epoch 30/30
119/119 [==============================] - 31s 264ms/step - loss: 1.8956 - accuracy: 0.4364 - val_loss: 1.9788 - val_accuracy: 0.4017
15/15 [==============================] - 1s 77ms/step - loss: 1.9128 - accuracy: 0.4165
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16)
(None, 32, 32, 16)
(None, 16, 16, 16) (None, 16, 16, 32)
(None, 16, 16, 32)
(None, 8, 8, 32) (None, 8, 8, 64)
(None, 8, 8, 64)
(None, 4, 4, 64) (None, 4, 4, 128)
(None, 4, 4, 128)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
Epoch 1/30
119/119 [==============================] - 11s 93ms/step - loss: 2.3834 - accuracy: 0.2689 - val_loss: 3.4486 - val_accuracy: 0.0909
Epoch 2/30
119/119 [==============================] - 20s 171ms/step - loss: 1.9299 - accuracy: 0.4024 - val_loss: 3.0082 - val_accuracy: 0.1850
Epoch 3/30
119/119 [==============================] - 10s 88ms/step - loss: 1.6950 - accuracy: 0.4734 - val_loss: 2.7162 - val_accuracy: 0.2537
Epoch 4/30
119/119 [==============================] - 10s 85ms/step - loss: 1.5239 - accuracy: 0.5292 - val_loss: 2.0900 - val_accuracy: 0.3784
Epoch 5/30
119/119 [==============================] - 11s 93ms/step - loss: 1.3720 - accuracy: 0.5700 - val_loss: 2.0462 - val_accuracy: 0.4080
Epoch 6/30
119/119 [==============================] - 10s 85ms/step - loss: 1.2388 - accuracy: 0.6123 - val_loss: 1.9015 - val_accuracy: 0.4239
Epoch 7/30
119/119 [==============================] - 10s 86ms/step - loss: 1.1414 - accuracy: 0.6459 - val_loss: 1.9362 - val_accuracy: 0.4577
Epoch 8/30
119/119 [==============================] - 10s 84ms/step - loss: 1.0127 - accuracy: 0.6856 - val_loss: 1.7292 - val_accuracy: 0.4915
Epoch 9/30
119/119 [==============================] - 10s 85ms/step - loss: 0.8881 - accuracy: 0.7237 - val_loss: 1.7947 - val_accuracy: 0.4968
Epoch 10/30
119/119 [==============================] - 10s 86ms/step - loss: 0.7735 - accuracy: 0.7612 - val_loss: 2.2651 - val_accuracy: 0.4038
Epoch 11/30
119/119 [==============================] - 10s 85ms/step - loss: 0.6581 - accuracy: 0.7989 - val_loss: 2.1604 - val_accuracy: 0.4323
Epoch 12/30
119/119 [==============================] - 10s 86ms/step - loss: 0.5880 - accuracy: 0.8221 - val_loss: 2.1410 - val_accuracy: 0.4810
Epoch 13/30
119/119 [==============================] - 10s 87ms/step - loss: 0.4658 - accuracy: 0.8626 - val_loss: 2.3875 - val_accuracy: 0.4471
Epoch 14/30
119/119 [==============================] - 10s 86ms/step - loss: 0.3569 - accuracy: 0.8942 - val_loss: 2.3137 - val_accuracy: 0.4440
Epoch 15/30
119/119 [==============================] - 10s 86ms/step - loss: 0.2936 - accuracy: 0.9201 - val_loss: 2.4142 - val_accuracy: 0.4619
Epoch 16/30
119/119 [==============================] - 10s 85ms/step - loss: 0.3102 - accuracy: 0.9108 - val_loss: 2.0854 - val_accuracy: 0.5095
Epoch 17/30
119/119 [==============================] - 10s 88ms/step - loss: 0.1978 - accuracy: 0.9497 - val_loss: 2.2813 - val_accuracy: 0.4598
Epoch 18/30
119/119 [==============================] - 11s 88ms/step - loss: 0.1296 - accuracy: 0.9716 - val_loss: 2.6530 - val_accuracy: 0.4556
Epoch 19/30
119/119 [==============================] - 10s 86ms/step - loss: 0.1847 - accuracy: 0.9467 - val_loss: 2.5972 - val_accuracy: 0.4556
Epoch 20/30
119/119 [==============================] - 10s 86ms/step - loss: 0.2408 - accuracy: 0.9269 - val_loss: 2.9407 - val_accuracy: 0.4440
Epoch 21/30
119/119 [==============================] - 10s 86ms/step - loss: 0.1143 - accuracy: 0.9720 - val_loss: 2.1467 - val_accuracy: 0.5328
Epoch 22/30
119/119 [==============================] - 10s 84ms/step - loss: 0.0754 - accuracy: 0.9848 - val_loss: 2.1907 - val_accuracy: 0.5190
Epoch 23/30
119/119 [==============================] - 10s 86ms/step - loss: 0.1872 - accuracy: 0.9459 - val_loss: 2.6617 - val_accuracy: 0.4672
Epoch 24/30
119/119 [==============================] - 10s 86ms/step - loss: 0.0693 - accuracy: 0.9839 - val_loss: 2.7640 - val_accuracy: 0.4725
Epoch 25/30
119/119 [==============================] - 10s 84ms/step - loss: 0.0858 - accuracy: 0.9794 - val_loss: 2.8388 - val_accuracy: 0.4535
Epoch 26/30
119/119 [==============================] - 10s 87ms/step - loss: 0.0361 - accuracy: 0.9935 - val_loss: 2.2049 - val_accuracy: 0.5349
Epoch 27/30
119/119 [==============================] - 10s 85ms/step - loss: 0.0601 - accuracy: 0.9856 - val_loss: 2.9170 - val_accuracy: 0.4767
Epoch 28/30
119/119 [==============================] - 10s 86ms/step - loss: 0.2449 - accuracy: 0.9209 - val_loss: 2.6399 - val_accuracy: 0.4937
Epoch 29/30
119/119 [==============================] - 10s 87ms/step - loss: 0.1345 - accuracy: 0.9624 - val_loss: 2.3574 - val_accuracy: 0.5349
Epoch 30/30
119/119 [==============================] - 10s 85ms/step - loss: 0.0469 - accuracy: 0.9923 - val_loss: 2.3174 - val_accuracy: 0.5381
15/15 [==============================] - 0s 28ms/step - loss: 2.0742 - accuracy: 0.5877
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16)
(None, 32, 32, 16)
(None, 16, 16, 16) (None, 16, 16, 32)
(None, 16, 16, 32)
(None, 8, 8, 32) (None, 8, 8, 64)
(None, 8, 8, 64)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 4, 4, 64) (None, 4, 4, 128)
(None, 4, 4, 128)
Epoch 1/30
119/119 [==============================] - 10s 85ms/step - loss: 2.6846 - accuracy: 0.1992 - val_loss: 3.0725 - val_accuracy: 0.0529
Epoch 2/30
119/119 [==============================] - 10s 84ms/step - loss: 2.2291 - accuracy: 0.3372 - val_loss: 3.0370 - val_accuracy: 0.1290
Epoch 3/30
119/119 [==============================] - 10s 83ms/step - loss: 2.0459 - accuracy: 0.3844 - val_loss: 2.8209 - val_accuracy: 0.1744
Epoch 4/30
119/119 [==============================] - 10s 87ms/step - loss: 1.9152 - accuracy: 0.4337 - val_loss: 2.4176 - val_accuracy: 0.2463
Epoch 5/30
119/119 [==============================] - 10s 87ms/step - loss: 1.8100 - accuracy: 0.4604 - val_loss: 1.9824 - val_accuracy: 0.3710
Epoch 6/30
119/119 [==============================] - 10s 83ms/step - loss: 1.7306 - accuracy: 0.4853 - val_loss: 1.9383 - val_accuracy: 0.3964
Epoch 7/30
119/119 [==============================] - 10s 84ms/step - loss: 1.6643 - accuracy: 0.4976 - val_loss: 1.8491 - val_accuracy: 0.4080
Epoch 8/30
119/119 [==============================] - 10s 84ms/step - loss: 1.5924 - accuracy: 0.5193 - val_loss: 1.8359 - val_accuracy: 0.4207
Epoch 9/30
119/119 [==============================] - 10s 84ms/step - loss: 1.5182 - accuracy: 0.5475 - val_loss: 1.8025 - val_accuracy: 0.4355
Epoch 10/30
119/119 [==============================] - 10s 85ms/step - loss: 1.4730 - accuracy: 0.5582 - val_loss: 1.7979 - val_accuracy: 0.4355
Epoch 11/30
119/119 [==============================] - 10s 84ms/step - loss: 1.4068 - accuracy: 0.5829 - val_loss: 1.7832 - val_accuracy: 0.4366
Epoch 12/30
119/119 [==============================] - 10s 85ms/step - loss: 1.3543 - accuracy: 0.5976 - val_loss: 1.7405 - val_accuracy: 0.4535
Epoch 13/30
119/119 [==============================] - 10s 84ms/step - loss: 1.3037 - accuracy: 0.6120 - val_loss: 1.7437 - val_accuracy: 0.4609
Epoch 14/30
119/119 [==============================] - 10s 84ms/step - loss: 1.2492 - accuracy: 0.6306 - val_loss: 1.7147 - val_accuracy: 0.4715
Epoch 15/30
119/119 [==============================] - 10s 85ms/step - loss: 1.2082 - accuracy: 0.6488 - val_loss: 1.7831 - val_accuracy: 0.4683
Epoch 16/30
119/119 [==============================] - 10s 83ms/step - loss: 1.1538 - accuracy: 0.6676 - val_loss: 1.7509 - val_accuracy: 0.4567
Epoch 17/30
119/119 [==============================] - 10s 85ms/step - loss: 1.1030 - accuracy: 0.6790 - val_loss: 1.7289 - val_accuracy: 0.4630
Epoch 18/30
119/119 [==============================] - 10s 84ms/step - loss: 1.0699 - accuracy: 0.6980 - val_loss: 1.7040 - val_accuracy: 0.4852
Epoch 19/30
119/119 [==============================] - 10s 84ms/step - loss: 1.0016 - accuracy: 0.7204 - val_loss: 1.7678 - val_accuracy: 0.4926
Epoch 20/30
119/119 [==============================] - 10s 86ms/step - loss: 0.9692 - accuracy: 0.7272 - val_loss: 1.7062 - val_accuracy: 0.4736
Epoch 21/30
119/119 [==============================] - 10s 84ms/step - loss: 0.9211 - accuracy: 0.7440 - val_loss: 1.7093 - val_accuracy: 0.4863
Epoch 22/30
119/119 [==============================] - 10s 83ms/step - loss: 0.8873 - accuracy: 0.7549 - val_loss: 1.7337 - val_accuracy: 0.4863
Epoch 23/30
119/119 [==============================] - 10s 85ms/step - loss: 0.8455 - accuracy: 0.7716 - val_loss: 1.7671 - val_accuracy: 0.4810
Epoch 24/30
119/119 [==============================] - 10s 84ms/step - loss: 0.7977 - accuracy: 0.7858 - val_loss: 1.7042 - val_accuracy: 0.4863
Epoch 25/30
119/119 [==============================] - 10s 84ms/step - loss: 0.7565 - accuracy: 0.7999 - val_loss: 1.7460 - val_accuracy: 0.4947
Epoch 26/30
119/119 [==============================] - 10s 85ms/step - loss: 0.7211 - accuracy: 0.8122 - val_loss: 1.7431 - val_accuracy: 0.5011
Epoch 27/30
119/119 [==============================] - 10s 83ms/step - loss: 0.6795 - accuracy: 0.8275 - val_loss: 1.7441 - val_accuracy: 0.4757
Epoch 28/30
119/119 [==============================] - 10s 84ms/step - loss: 0.6486 - accuracy: 0.8356 - val_loss: 1.7915 - val_accuracy: 0.4915
Epoch 29/30
119/119 [==============================] - 10s 85ms/step - loss: 0.6115 - accuracy: 0.8550 - val_loss: 1.7964 - val_accuracy: 0.5021
Epoch 30/30
119/119 [==============================] - 10s 84ms/step - loss: 0.5825 - accuracy: 0.8598 - val_loss: 1.7533 - val_accuracy: 0.4947
15/15 [==============================] - 0s 27ms/step - loss: 1.6744 - accuracy: 0.5201
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16)
(None, 32, 32, 16)
(None, 32, 32, 16) (None, 32, 32, 32)
(None, 32, 32, 32)
(None, 32, 32, 32) (None, 32, 32, 64)
(None, 32, 32, 64)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 64) (None, 32, 32, 128)
(None, 32, 32, 128)
Epoch 1/30
119/119 [==============================] - 92s 768ms/step - loss: 2.5061 - accuracy: 0.2392 - val_loss: 3.4325 - val_accuracy: 0.0782
Epoch 2/30
119/119 [==============================] - 92s 774ms/step - loss: 2.1400 - accuracy: 0.3385 - val_loss: 3.3008 - val_accuracy: 0.1501
Epoch 3/30
119/119 [==============================] - 91s 768ms/step - loss: 1.9637 - accuracy: 0.3882 - val_loss: 2.5070 - val_accuracy: 0.2812
Epoch 4/30
119/119 [==============================] - 94s 789ms/step - loss: 1.8381 - accuracy: 0.4343 - val_loss: 2.0858 - val_accuracy: 0.3753
Epoch 5/30
119/119 [==============================] - 95s 796ms/step - loss: 1.7251 - accuracy: 0.4607 - val_loss: 2.4850 - val_accuracy: 0.3161
Epoch 6/30
119/119 [==============================] - 93s 779ms/step - loss: 1.6210 - accuracy: 0.4950 - val_loss: 1.9385 - val_accuracy: 0.4006
Epoch 7/30
119/119 [==============================] - 92s 770ms/step - loss: 1.5456 - accuracy: 0.5159 - val_loss: 1.9169 - val_accuracy: 0.4080
Epoch 8/30
119/119 [==============================] - 92s 772ms/step - loss: 1.4859 - accuracy: 0.5369 - val_loss: 2.1015 - val_accuracy: 0.3922
Epoch 9/30
119/119 [==============================] - 92s 774ms/step - loss: 1.4041 - accuracy: 0.5604 - val_loss: 2.3631 - val_accuracy: 0.3658
Epoch 10/30
119/119 [==============================] - 93s 781ms/step - loss: 1.3373 - accuracy: 0.5787 - val_loss: 1.8812 - val_accuracy: 0.4440
Epoch 11/30
119/119 [==============================] - 92s 771ms/step - loss: 1.2806 - accuracy: 0.6016 - val_loss: 2.5300 - val_accuracy: 0.3298
Epoch 12/30
119/119 [==============================] - 92s 774ms/step - loss: 1.2212 - accuracy: 0.6242 - val_loss: 1.8837 - val_accuracy: 0.4408
Epoch 13/30
119/119 [==============================] - 92s 773ms/step - loss: 1.1705 - accuracy: 0.6331 - val_loss: 2.0375 - val_accuracy: 0.4503
Epoch 14/30
119/119 [==============================] - 92s 774ms/step - loss: 1.1305 - accuracy: 0.6483 - val_loss: 2.6786 - val_accuracy: 0.3488
Epoch 15/30
119/119 [==============================] - 91s 769ms/step - loss: 1.0788 - accuracy: 0.6599 - val_loss: 1.6440 - val_accuracy: 0.5381
Epoch 16/30
119/119 [==============================] - 91s 768ms/step - loss: 1.0483 - accuracy: 0.6697 - val_loss: 2.5340 - val_accuracy: 0.3869
Epoch 17/30
119/119 [==============================] - 93s 782ms/step - loss: 0.9899 - accuracy: 0.6934 - val_loss: 1.6603 - val_accuracy: 0.4968
Epoch 18/30
119/119 [==============================] - 93s 780ms/step - loss: 0.9572 - accuracy: 0.7004 - val_loss: 1.5398 - val_accuracy: 0.5539
Epoch 19/30
119/119 [==============================] - 92s 775ms/step - loss: 0.9097 - accuracy: 0.7177 - val_loss: 1.7929 - val_accuracy: 0.4979
Epoch 20/30
119/119 [==============================] - 92s 772ms/step - loss: 0.8833 - accuracy: 0.7218 - val_loss: 1.8473 - val_accuracy: 0.4757
Epoch 21/30
119/119 [==============================] - 93s 784ms/step - loss: 0.8216 - accuracy: 0.7417 - val_loss: 1.8516 - val_accuracy: 0.4958
Epoch 22/30
119/119 [==============================] - 91s 768ms/step - loss: 0.8137 - accuracy: 0.7497 - val_loss: 1.9918 - val_accuracy: 0.4926
Epoch 23/30
119/119 [==============================] - 92s 774ms/step - loss: 0.7512 - accuracy: 0.7706 - val_loss: 1.8060 - val_accuracy: 0.4831
Epoch 24/30
119/119 [==============================] - 93s 783ms/step - loss: 0.7303 - accuracy: 0.7741 - val_loss: 3.1188 - val_accuracy: 0.3467
Epoch 25/30
119/119 [==============================] - 93s 785ms/step - loss: 0.6880 - accuracy: 0.7888 - val_loss: 2.7505 - val_accuracy: 0.4440
Epoch 26/30
119/119 [==============================] - 92s 771ms/step - loss: 0.6537 - accuracy: 0.7983 - val_loss: 2.5711 - val_accuracy: 0.4038
Epoch 27/30
119/119 [==============================] - 92s 772ms/step - loss: 0.6306 - accuracy: 0.8067 - val_loss: 1.7983 - val_accuracy: 0.4968
Epoch 28/30
119/119 [==============================] - 92s 776ms/step - loss: 0.6035 - accuracy: 0.8185 - val_loss: 1.4119 - val_accuracy: 0.5877
Epoch 29/30
119/119 [==============================] - 93s 782ms/step - loss: 0.5894 - accuracy: 0.8247 - val_loss: 1.8459 - val_accuracy: 0.5349
Epoch 30/30
119/119 [==============================] - 92s 774ms/step - loss: 0.5462 - accuracy: 0.8327 - val_loss: 2.5446 - val_accuracy: 0.4334
15/15 [==============================] - 3s 223ms/step - loss: 2.4449 - accuracy: 0.4588
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16)
(None, 32, 32, 16)
(None, 32, 32, 16) (None, 32, 32, 32)
(None, 32, 32, 32)
(None, 32, 32, 32) (None, 32, 32, 64)
(None, 32, 32, 64)
(None, 32, 32, 64) (None, 32, 32, 128)
(None, 32, 32, 128)
Epoch 1/30
119/119 [==============================] - 93s 779ms/step - loss: 2.7825 - accuracy: 0.1837 - val_loss: 3.0065 - val_accuracy: 0.0655
Epoch 2/30
119/119 [==============================] - 93s 785ms/step - loss: 2.4630 - accuracy: 0.2571 - val_loss: 2.8331 - val_accuracy: 0.1173
Epoch 3/30
119/119 [==============================] - 94s 790ms/step - loss: 2.3604 - accuracy: 0.2903 - val_loss: 2.5022 - val_accuracy: 0.2230
Epoch 4/30
119/119 [==============================] - 93s 779ms/step - loss: 2.2689 - accuracy: 0.3184 - val_loss: 2.3186 - val_accuracy: 0.2653
Epoch 5/30
119/119 [==============================] - 92s 773ms/step - loss: 2.1968 - accuracy: 0.3452 - val_loss: 2.2480 - val_accuracy: 0.2918
Epoch 6/30
119/119 [==============================] - 93s 781ms/step - loss: 2.1397 - accuracy: 0.3509 - val_loss: 2.2450 - val_accuracy: 0.3140
Epoch 7/30
119/119 [==============================] - 92s 771ms/step - loss: 2.0902 - accuracy: 0.3675 - val_loss: 2.1603 - val_accuracy: 0.3235
Epoch 8/30
119/119 [==============================] - 92s 771ms/step - loss: 2.0456 - accuracy: 0.3817 - val_loss: 2.1233 - val_accuracy: 0.3393
Epoch 9/30
119/119 [==============================] - 92s 769ms/step - loss: 2.0043 - accuracy: 0.3937 - val_loss: 2.0802 - val_accuracy: 0.3520
Epoch 10/30
119/119 [==============================] - 93s 780ms/step - loss: 1.9670 - accuracy: 0.4077 - val_loss: 2.0376 - val_accuracy: 0.3700
Epoch 11/30
119/119 [==============================] - 92s 771ms/step - loss: 1.9261 - accuracy: 0.4175 - val_loss: 2.0072 - val_accuracy: 0.3858
Epoch 12/30
119/119 [==============================] - 92s 773ms/step - loss: 1.8915 - accuracy: 0.4271 - val_loss: 2.0146 - val_accuracy: 0.3901
Epoch 13/30
119/119 [==============================] - 94s 788ms/step - loss: 1.8642 - accuracy: 0.4376 - val_loss: 1.9898 - val_accuracy: 0.3911
Epoch 14/30
119/119 [==============================] - 95s 801ms/step - loss: 1.8304 - accuracy: 0.4459 - val_loss: 1.9197 - val_accuracy: 0.4133
Epoch 15/30
119/119 [==============================] - 93s 779ms/step - loss: 1.8056 - accuracy: 0.4615 - val_loss: 1.9078 - val_accuracy: 0.4070
Epoch 16/30
119/119 [==============================] - 92s 769ms/step - loss: 1.7728 - accuracy: 0.4669 - val_loss: 1.9442 - val_accuracy: 0.4080
Epoch 17/30
119/119 [==============================] - 92s 771ms/step - loss: 1.7483 - accuracy: 0.4697 - val_loss: 1.8707 - val_accuracy: 0.4186
Epoch 18/30
119/119 [==============================] - 92s 773ms/step - loss: 1.7280 - accuracy: 0.4847 - val_loss: 1.8787 - val_accuracy: 0.4186
Epoch 19/30
119/119 [==============================] - 93s 778ms/step - loss: 1.6937 - accuracy: 0.4958 - val_loss: 1.8903 - val_accuracy: 0.4419
Epoch 20/30
119/119 [==============================] - 92s 772ms/step - loss: 1.6760 - accuracy: 0.5009 - val_loss: 1.8343 - val_accuracy: 0.4249
Epoch 21/30
119/119 [==============================] - 92s 771ms/step - loss: 1.6454 - accuracy: 0.5046 - val_loss: 1.8050 - val_accuracy: 0.4440
Epoch 22/30
119/119 [==============================] - 93s 780ms/step - loss: 1.6285 - accuracy: 0.5123 - val_loss: 1.8272 - val_accuracy: 0.4408
Epoch 23/30
119/119 [==============================] - 92s 775ms/step - loss: 1.6033 - accuracy: 0.5274 - val_loss: 1.8058 - val_accuracy: 0.4524
Epoch 24/30
119/119 [==============================] - 92s 771ms/step - loss: 1.5884 - accuracy: 0.5283 - val_loss: 1.7961 - val_accuracy: 0.4588
Epoch 25/30
119/119 [==============================] - 91s 767ms/step - loss: 1.5607 - accuracy: 0.5253 - val_loss: 1.8142 - val_accuracy: 0.4493
Epoch 26/30
119/119 [==============================] - 93s 778ms/step - loss: 1.5357 - accuracy: 0.5433 - val_loss: 1.8431 - val_accuracy: 0.4387
Epoch 27/30
119/119 [==============================] - 92s 773ms/step - loss: 1.5262 - accuracy: 0.5489 - val_loss: 1.7957 - val_accuracy: 0.4630
Epoch 28/30
119/119 [==============================] - 92s 774ms/step - loss: 1.5147 - accuracy: 0.5585 - val_loss: 1.7441 - val_accuracy: 0.4715
Epoch 29/30
119/119 [==============================] - 92s 770ms/step - loss: 1.4942 - accuracy: 0.5582 - val_loss: 1.6971 - val_accuracy: 0.4915
Epoch 30/30
119/119 [==============================] - 93s 779ms/step - loss: 1.4704 - accuracy: 0.5623 - val_loss: 1.6810 - val_accuracy: 0.5032
15/15 [==============================] - 3s 207ms/step - loss: 1.6090 - accuracy: 0.5042
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16)
(None, 32, 32, 16)
(None, 16, 16, 16) (None, 16, 16, 32)
(None, 16, 16, 32)
(None, 8, 8, 32) (None, 8, 8, 64)
(None, 8, 8, 64)
(None, 4, 4, 64) (None, 4, 4, 128)
(None, 4, 4, 128)
Epoch 1/30
119/119 [==============================] - 10s 87ms/step - loss: 2.3688 - accuracy: 0.2776 - val_loss: 3.2430 - val_accuracy: 0.0877
Epoch 2/30
119/119 [==============================] - 10s 85ms/step - loss: 1.9497 - accuracy: 0.3999 - val_loss: 2.8502 - val_accuracy: 0.1776
Epoch 3/30
119/119 [==============================] - 10s 86ms/step - loss: 1.7221 - accuracy: 0.4696 - val_loss: 2.5000 - val_accuracy: 0.2474
Epoch 4/30
119/119 [==============================] - 10s 85ms/step - loss: 1.5607 - accuracy: 0.5180 - val_loss: 2.0859 - val_accuracy: 0.3975
Epoch 5/30
119/119 [==============================] - 10s 86ms/step - loss: 1.4028 - accuracy: 0.5639 - val_loss: 1.8938 - val_accuracy: 0.4260
Epoch 6/30
119/119 [==============================] - 10s 86ms/step - loss: 1.2712 - accuracy: 0.6065 - val_loss: 1.8184 - val_accuracy: 0.4630
Epoch 7/30
119/119 [==============================] - 10s 85ms/step - loss: 1.1500 - accuracy: 0.6458 - val_loss: 2.0550 - val_accuracy: 0.4017
Epoch 8/30
119/119 [==============================] - 10s 85ms/step - loss: 1.0292 - accuracy: 0.6747 - val_loss: 1.9461 - val_accuracy: 0.4281
Epoch 9/30
119/119 [==============================] - 10s 86ms/step - loss: 0.9093 - accuracy: 0.7159 - val_loss: 2.0212 - val_accuracy: 0.4334
Epoch 10/30
119/119 [==============================] - 10s 85ms/step - loss: 0.8079 - accuracy: 0.7491 - val_loss: 1.9461 - val_accuracy: 0.4757
Epoch 11/30
119/119 [==============================] - 10s 86ms/step - loss: 0.6547 - accuracy: 0.7992 - val_loss: 1.9913 - val_accuracy: 0.4736
Epoch 12/30
119/119 [==============================] - 10s 85ms/step - loss: 0.5649 - accuracy: 0.8245 - val_loss: 1.8740 - val_accuracy: 0.5000
Epoch 13/30
119/119 [==============================] - 10s 85ms/step - loss: 0.4339 - accuracy: 0.8747 - val_loss: 2.6336 - val_accuracy: 0.4091
Epoch 14/30
119/119 [==============================] - 10s 84ms/step - loss: 0.3573 - accuracy: 0.9004 - val_loss: 2.1331 - val_accuracy: 0.4641
Epoch 15/30
119/119 [==============================] - 10s 85ms/step - loss: 0.3025 - accuracy: 0.9152 - val_loss: 2.7283 - val_accuracy: 0.3964
Epoch 16/30
119/119 [==============================] - 10s 85ms/step - loss: 0.2696 - accuracy: 0.9220 - val_loss: 2.2520 - val_accuracy: 0.4556
Epoch 17/30
119/119 [==============================] - 10s 85ms/step - loss: 0.1764 - accuracy: 0.9567 - val_loss: 3.2836 - val_accuracy: 0.3911
Epoch 18/30
119/119 [==============================] - 10s 84ms/step - loss: 0.2429 - accuracy: 0.9272 - val_loss: 2.9551 - val_accuracy: 0.4281
Epoch 19/30
119/119 [==============================] - 10s 86ms/step - loss: 0.1590 - accuracy: 0.9589 - val_loss: 2.4871 - val_accuracy: 0.4693
Epoch 20/30
119/119 [==============================] - 10s 85ms/step - loss: 0.1034 - accuracy: 0.9778 - val_loss: 2.6174 - val_accuracy: 0.4567
Epoch 21/30
119/119 [==============================] - 10s 86ms/step - loss: 0.0703 - accuracy: 0.9862 - val_loss: 2.4813 - val_accuracy: 0.4746
Epoch 22/30
119/119 [==============================] - 10s 85ms/step - loss: 0.1117 - accuracy: 0.9702 - val_loss: 2.3009 - val_accuracy: 0.4884
Epoch 23/30
119/119 [==============================] - 10s 85ms/step - loss: 0.0915 - accuracy: 0.9767 - val_loss: 2.5429 - val_accuracy: 0.4820
Epoch 24/30
119/119 [==============================] - 10s 84ms/step - loss: 0.0607 - accuracy: 0.9884 - val_loss: 3.0438 - val_accuracy: 0.4609
Epoch 25/30
119/119 [==============================] - 10s 85ms/step - loss: 0.1500 - accuracy: 0.9537 - val_loss: 3.0143 - val_accuracy: 0.4387
Epoch 26/30
119/119 [==============================] - 10s 85ms/step - loss: 0.0676 - accuracy: 0.9841 - val_loss: 3.0755 - val_accuracy: 0.4598
Epoch 27/30
119/119 [==============================] - 10s 85ms/step - loss: 0.0304 - accuracy: 0.9960 - val_loss: 2.2313 - val_accuracy: 0.5338
Epoch 28/30
119/119 [==============================] - 10s 85ms/step - loss: 0.0476 - accuracy: 0.9905 - val_loss: 2.6797 - val_accuracy: 0.5011
Epoch 29/30
119/119 [==============================] - 10s 84ms/step - loss: 0.1747 - accuracy: 0.9447 - val_loss: 3.6261 - val_accuracy: 0.4006
Epoch 30/30
119/119 [==============================] - 10s 85ms/step - loss: 0.1255 - accuracy: 0.9604 - val_loss: 2.9908 - val_accuracy: 0.4630
15/15 [==============================] - 0s 28ms/step - loss: 2.7010 - accuracy: 0.4937
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16)
(None, 32, 32, 16)
(None, 16, 16, 16) (None, 16, 16, 32)
(None, 16, 16, 32)
(None, 8, 8, 32) (None, 8, 8, 64)
(None, 8, 8, 64)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 4, 4, 64) (None, 4, 4, 128)
(None, 4, 4, 128)
Epoch 1/30
119/119 [==============================] - 11s 88ms/step - loss: 2.7374 - accuracy: 0.1974 - val_loss: 3.0253 - val_accuracy: 0.0698
Epoch 2/30
119/119 [==============================] - 10s 86ms/step - loss: 2.3000 - accuracy: 0.3249 - val_loss: 3.1451 - val_accuracy: 0.0951
Epoch 3/30
119/119 [==============================] - 10s 86ms/step - loss: 2.1184 - accuracy: 0.3692 - val_loss: 2.7375 - val_accuracy: 0.1966
Epoch 4/30
119/119 [==============================] - 10s 85ms/step - loss: 1.9682 - accuracy: 0.4147 - val_loss: 2.1943 - val_accuracy: 0.3214
Epoch 5/30
119/119 [==============================] - 10s 83ms/step - loss: 1.8559 - accuracy: 0.4460 - val_loss: 2.0400 - val_accuracy: 0.3700
Epoch 6/30
119/119 [==============================] - 10s 84ms/step - loss: 1.7722 - accuracy: 0.4697 - val_loss: 1.9833 - val_accuracy: 0.3996
Epoch 7/30
119/119 [==============================] - 10s 86ms/step - loss: 1.6983 - accuracy: 0.4874 - val_loss: 1.8708 - val_accuracy: 0.4091
Epoch 8/30
119/119 [==============================] - 10s 84ms/step - loss: 1.6215 - accuracy: 0.5126 - val_loss: 1.8408 - val_accuracy: 0.4345
Epoch 9/30
119/119 [==============================] - 10s 85ms/step - loss: 1.5580 - accuracy: 0.5323 - val_loss: 1.8092 - val_accuracy: 0.4408
Epoch 10/30
119/119 [==============================] - 10s 84ms/step - loss: 1.4958 - accuracy: 0.5474 - val_loss: 1.8041 - val_accuracy: 0.4228
Epoch 11/30
119/119 [==============================] - 10s 84ms/step - loss: 1.4365 - accuracy: 0.5681 - val_loss: 1.7991 - val_accuracy: 0.4471
Epoch 12/30
119/119 [==============================] - 10s 84ms/step - loss: 1.3794 - accuracy: 0.5847 - val_loss: 1.7582 - val_accuracy: 0.4524
Epoch 13/30
119/119 [==============================] - 10s 85ms/step - loss: 1.3263 - accuracy: 0.5976 - val_loss: 1.7915 - val_accuracy: 0.4440
Epoch 14/30
119/119 [==============================] - 10s 86ms/step - loss: 1.2739 - accuracy: 0.6192 - val_loss: 1.7526 - val_accuracy: 0.4440
Epoch 15/30
119/119 [==============================] - 10s 83ms/step - loss: 1.2243 - accuracy: 0.6377 - val_loss: 1.7689 - val_accuracy: 0.4524
Epoch 16/30
119/119 [==============================] - 10s 84ms/step - loss: 1.1676 - accuracy: 0.6536 - val_loss: 1.6976 - val_accuracy: 0.4799
Epoch 17/30
119/119 [==============================] - 10s 84ms/step - loss: 1.1202 - accuracy: 0.6704 - val_loss: 1.6949 - val_accuracy: 0.4704
Epoch 18/30
119/119 [==============================] - 10s 84ms/step - loss: 1.0843 - accuracy: 0.6881 - val_loss: 1.7078 - val_accuracy: 0.4831
Epoch 19/30
119/119 [==============================] - 10s 84ms/step - loss: 1.0266 - accuracy: 0.7036 - val_loss: 1.6967 - val_accuracy: 0.4799
Epoch 20/30
119/119 [==============================] - 10s 84ms/step - loss: 0.9905 - accuracy: 0.7139 - val_loss: 1.7196 - val_accuracy: 0.4683
Epoch 21/30
119/119 [==============================] - 10s 84ms/step - loss: 0.9409 - accuracy: 0.7356 - val_loss: 1.7235 - val_accuracy: 0.4757
Epoch 22/30
119/119 [==============================] - 10s 85ms/step - loss: 0.9000 - accuracy: 0.7466 - val_loss: 1.7483 - val_accuracy: 0.4767
Epoch 23/30
119/119 [==============================] - 10s 84ms/step - loss: 0.8537 - accuracy: 0.7642 - val_loss: 1.7941 - val_accuracy: 0.4641
Epoch 24/30
119/119 [==============================] - 10s 84ms/step - loss: 0.8117 - accuracy: 0.7734 - val_loss: 1.7263 - val_accuracy: 0.4789
Epoch 25/30
119/119 [==============================] - 10s 85ms/step - loss: 0.7765 - accuracy: 0.7913 - val_loss: 1.7320 - val_accuracy: 0.4831
Epoch 26/30
119/119 [==============================] - 10s 84ms/step - loss: 0.7347 - accuracy: 0.8066 - val_loss: 1.7658 - val_accuracy: 0.4841
Epoch 27/30
119/119 [==============================] - 10s 84ms/step - loss: 0.6961 - accuracy: 0.8221 - val_loss: 1.7102 - val_accuracy: 0.4989
Epoch 28/30
119/119 [==============================] - 10s 85ms/step - loss: 0.6617 - accuracy: 0.8312 - val_loss: 1.7539 - val_accuracy: 0.4831
Epoch 29/30
119/119 [==============================] - 10s 83ms/step - loss: 0.6246 - accuracy: 0.8435 - val_loss: 1.7262 - val_accuracy: 0.4968
Epoch 30/30
119/119 [==============================] - 10s 84ms/step - loss: 0.5900 - accuracy: 0.8567 - val_loss: 1.7446 - val_accuracy: 0.4979
15/15 [==============================] - 0s 29ms/step - loss: 1.7125 - accuracy: 0.5000
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16)
(None, 32, 32, 16)
(None, 32, 32, 16) (None, 32, 32, 32)
(None, 32, 32, 32)
(None, 32, 32, 32) (None, 32, 32, 64)
(None, 32, 32, 64)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 64) (None, 32, 32, 128)
(None, 32, 32, 128)
Epoch 1/30
119/119 [==============================] - 93s 778ms/step - loss: 2.4797 - accuracy: 0.2426 - val_loss: 3.2703 - val_accuracy: 0.0751
Epoch 2/30
119/119 [==============================] - 92s 771ms/step - loss: 2.1218 - accuracy: 0.3447 - val_loss: 3.5075 - val_accuracy: 0.1036
Epoch 3/30
119/119 [==============================] - 92s 771ms/step - loss: 1.9604 - accuracy: 0.3886 - val_loss: 2.7113 - val_accuracy: 0.1712
Epoch 4/30
119/119 [==============================] - 93s 778ms/step - loss: 1.8223 - accuracy: 0.4385 - val_loss: 2.1073 - val_accuracy: 0.3256
Epoch 5/30
119/119 [==============================] - 93s 781ms/step - loss: 1.7025 - accuracy: 0.4733 - val_loss: 2.7600 - val_accuracy: 0.2801
Epoch 6/30
119/119 [==============================] - 92s 770ms/step - loss: 1.5933 - accuracy: 0.5062 - val_loss: 2.0854 - val_accuracy: 0.3626
Epoch 7/30
119/119 [==============================] - 92s 773ms/step - loss: 1.5110 - accuracy: 0.5312 - val_loss: 1.8774 - val_accuracy: 0.4154
Epoch 8/30
119/119 [==============================] - 91s 768ms/step - loss: 1.4501 - accuracy: 0.5468 - val_loss: 2.0774 - val_accuracy: 0.3531
Epoch 9/30
119/119 [==============================] - 94s 787ms/step - loss: 1.3587 - accuracy: 0.5761 - val_loss: 2.2962 - val_accuracy: 0.3763
Epoch 10/30
119/119 [==============================] - 92s 776ms/step - loss: 1.3063 - accuracy: 0.5950 - val_loss: 2.2757 - val_accuracy: 0.3605
Epoch 11/30
119/119 [==============================] - 92s 773ms/step - loss: 1.2354 - accuracy: 0.6153 - val_loss: 1.8795 - val_accuracy: 0.4397
Epoch 12/30
119/119 [==============================] - 92s 772ms/step - loss: 1.1946 - accuracy: 0.6312 - val_loss: 1.8045 - val_accuracy: 0.4810
Epoch 13/30
119/119 [==============================] - 93s 782ms/step - loss: 1.1389 - accuracy: 0.6456 - val_loss: 2.7180 - val_accuracy: 0.3869
Epoch 14/30
119/119 [==============================] - 92s 773ms/step - loss: 1.0965 - accuracy: 0.6554 - val_loss: 2.5244 - val_accuracy: 0.3076
Epoch 15/30
119/119 [==============================] - 92s 777ms/step - loss: 1.0353 - accuracy: 0.6772 - val_loss: 2.3583 - val_accuracy: 0.4027
Epoch 16/30
119/119 [==============================] - 93s 781ms/step - loss: 1.0066 - accuracy: 0.6843 - val_loss: 1.9134 - val_accuracy: 0.4545
Epoch 17/30
119/119 [==============================] - 95s 795ms/step - loss: 0.9455 - accuracy: 0.7075 - val_loss: 1.7326 - val_accuracy: 0.4841
Epoch 18/30
119/119 [==============================] - 92s 773ms/step - loss: 0.9183 - accuracy: 0.7060 - val_loss: 2.0191 - val_accuracy: 0.4059
Epoch 19/30
119/119 [==============================] - 92s 773ms/step - loss: 0.8671 - accuracy: 0.7340 - val_loss: 3.5855 - val_accuracy: 0.3541
Epoch 20/30
119/119 [==============================] - 92s 773ms/step - loss: 0.8450 - accuracy: 0.7370 - val_loss: 1.6131 - val_accuracy: 0.5254
Epoch 21/30
119/119 [==============================] - 93s 778ms/step - loss: 0.7802 - accuracy: 0.7513 - val_loss: 2.8614 - val_accuracy: 0.3753
Epoch 22/30
119/119 [==============================] - 92s 776ms/step - loss: 0.7714 - accuracy: 0.7626 - val_loss: 1.6595 - val_accuracy: 0.5497
Epoch 23/30
119/119 [==============================] - 92s 774ms/step - loss: 0.7118 - accuracy: 0.7821 - val_loss: 1.9047 - val_accuracy: 0.4651
Epoch 24/30
119/119 [==============================] - 93s 778ms/step - loss: 0.6736 - accuracy: 0.7946 - val_loss: 1.5402 - val_accuracy: 0.5423
Epoch 25/30
119/119 [==============================] - 92s 769ms/step - loss: 0.6358 - accuracy: 0.8053 - val_loss: 1.7402 - val_accuracy: 0.5190
Epoch 26/30
119/119 [==============================] - 92s 770ms/step - loss: 0.6005 - accuracy: 0.8188 - val_loss: 1.7459 - val_accuracy: 0.5391
Epoch 27/30
119/119 [==============================] - 92s 770ms/step - loss: 0.5894 - accuracy: 0.8151 - val_loss: 1.4233 - val_accuracy: 0.5740
Epoch 28/30
119/119 [==============================] - 93s 779ms/step - loss: 0.5485 - accuracy: 0.8298 - val_loss: 2.3639 - val_accuracy: 0.4249
Epoch 29/30
119/119 [==============================] - 92s 774ms/step - loss: 0.5281 - accuracy: 0.8398 - val_loss: 1.8030 - val_accuracy: 0.5169
Epoch 30/30
119/119 [==============================] - 92s 774ms/step - loss: 0.4911 - accuracy: 0.8513 - val_loss: 2.3424 - val_accuracy: 0.4693
15/15 [==============================] - 3s 207ms/step - loss: 2.1687 - accuracy: 0.4937
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16)
(None, 32, 32, 16)
(None, 32, 32, 16) (None, 32, 32, 32)
(None, 32, 32, 32)
(None, 32, 32, 32) (None, 32, 32, 64)
(None, 32, 32, 64)
(None, 32, 32, 64) (None, 32, 32, 128)
(None, 32, 32, 128)
Epoch 1/30
119/119 [==============================] - 92s 771ms/step - loss: 2.7917 - accuracy: 0.1835 - val_loss: 3.1278 - val_accuracy: 0.0761
Epoch 2/30
119/119 [==============================] - 92s 777ms/step - loss: 2.4628 - accuracy: 0.2754 - val_loss: 3.0352 - val_accuracy: 0.1004
Epoch 3/30
119/119 [==============================] - 93s 778ms/step - loss: 2.3483 - accuracy: 0.2951 - val_loss: 2.6549 - val_accuracy: 0.1892
Epoch 4/30
119/119 [==============================] - 92s 772ms/step - loss: 2.2489 - accuracy: 0.3216 - val_loss: 2.3375 - val_accuracy: 0.3002
Epoch 5/30
119/119 [==============================] - 94s 786ms/step - loss: 2.1659 - accuracy: 0.3525 - val_loss: 2.2989 - val_accuracy: 0.2812
Epoch 6/30
119/119 [==============================] - 94s 787ms/step - loss: 2.1007 - accuracy: 0.3656 - val_loss: 2.2593 - val_accuracy: 0.3044
Epoch 7/30
119/119 [==============================] - 92s 770ms/step - loss: 2.0516 - accuracy: 0.3866 - val_loss: 2.1470 - val_accuracy: 0.3340
Epoch 8/30
119/119 [==============================] - 92s 773ms/step - loss: 1.9989 - accuracy: 0.3987 - val_loss: 2.0914 - val_accuracy: 0.3710
Epoch 9/30
119/119 [==============================] - 93s 781ms/step - loss: 1.9603 - accuracy: 0.4111 - val_loss: 2.0989 - val_accuracy: 0.3520
Epoch 10/30
119/119 [==============================] - 92s 775ms/step - loss: 1.9232 - accuracy: 0.4239 - val_loss: 2.0475 - val_accuracy: 0.3605
Epoch 11/30
119/119 [==============================] - 91s 769ms/step - loss: 1.8817 - accuracy: 0.4324 - val_loss: 2.0483 - val_accuracy: 0.3975
Epoch 12/30
119/119 [==============================] - 92s 770ms/step - loss: 1.8541 - accuracy: 0.4422 - val_loss: 1.9675 - val_accuracy: 0.3953
Epoch 13/30
119/119 [==============================] - 92s 774ms/step - loss: 1.8195 - accuracy: 0.4549 - val_loss: 2.0158 - val_accuracy: 0.3932
Epoch 14/30
119/119 [==============================] - 92s 773ms/step - loss: 1.7932 - accuracy: 0.4597 - val_loss: 1.9514 - val_accuracy: 0.4038
Epoch 15/30
119/119 [==============================] - 91s 768ms/step - loss: 1.7682 - accuracy: 0.4730 - val_loss: 1.9430 - val_accuracy: 0.4027
Epoch 16/30
119/119 [==============================] - 91s 768ms/step - loss: 1.7372 - accuracy: 0.4810 - val_loss: 1.8798 - val_accuracy: 0.4154
Epoch 17/30
119/119 [==============================] - 93s 778ms/step - loss: 1.7101 - accuracy: 0.4906 - val_loss: 1.8655 - val_accuracy: 0.4281
Epoch 18/30
119/119 [==============================] - 91s 769ms/step - loss: 1.6952 - accuracy: 0.4918 - val_loss: 1.8215 - val_accuracy: 0.4408
Epoch 19/30
119/119 [==============================] - 92s 771ms/step - loss: 1.6544 - accuracy: 0.5077 - val_loss: 1.8703 - val_accuracy: 0.4197
Epoch 20/30
119/119 [==============================] - 92s 774ms/step - loss: 1.6425 - accuracy: 0.5085 - val_loss: 1.8420 - val_accuracy: 0.4355
Epoch 21/30
119/119 [==============================] - 92s 774ms/step - loss: 1.6178 - accuracy: 0.5138 - val_loss: 1.8107 - val_accuracy: 0.4366
Epoch 22/30
119/119 [==============================] - 92s 773ms/step - loss: 1.6006 - accuracy: 0.5209 - val_loss: 1.8185 - val_accuracy: 0.4281
Epoch 23/30
119/119 [==============================] - 92s 771ms/step - loss: 1.5753 - accuracy: 0.5224 - val_loss: 1.8106 - val_accuracy: 0.4387
Epoch 24/30
119/119 [==============================] - 93s 786ms/step - loss: 1.5625 - accuracy: 0.5299 - val_loss: 1.8971 - val_accuracy: 0.4271
Epoch 25/30
119/119 [==============================] - 95s 802ms/step - loss: 1.5351 - accuracy: 0.5380 - val_loss: 1.7986 - val_accuracy: 0.4535
Epoch 26/30
119/119 [==============================] - 93s 777ms/step - loss: 1.5152 - accuracy: 0.5438 - val_loss: 1.9466 - val_accuracy: 0.4080
Epoch 27/30
119/119 [==============================] - 92s 775ms/step - loss: 1.5019 - accuracy: 0.5505 - val_loss: 1.7746 - val_accuracy: 0.4503
Epoch 28/30
119/119 [==============================] - 93s 783ms/step - loss: 1.4850 - accuracy: 0.5622 - val_loss: 1.7164 - val_accuracy: 0.4577
Epoch 29/30
119/119 [==============================] - 92s 771ms/step - loss: 1.4686 - accuracy: 0.5642 - val_loss: 1.7044 - val_accuracy: 0.4725
Epoch 30/30
119/119 [==============================] - 93s 778ms/step - loss: 1.4490 - accuracy: 0.5706 - val_loss: 1.7182 - val_accuracy: 0.4715
15/15 [==============================] - 3s 217ms/step - loss: 1.5830 - accuracy: 0.5159
In [224]:
print (best_config)
(4, True, True, 0.001, 0.5877378582954407)

**Question 3.3** We now try to apply data augmentation to improve the performance. Extend the code of the class YourModel so that if the attribute is_augmentation is set to True, we apply the data augmentation. Also you need to incorporate early stopping to your training process. Specifically, you early stop the training if the valid accuracy cannot increase in three consecutive epochs.

[4 points]
In [53]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import EarlyStopping

Wtire your code in the cell below. Hint that you can rewrite the code of the fit method to apply the data augmentation. In addition, you can copy the code of build_cnn method above to reuse here.

In [86]:
class YourModel(DefaultModel):
    def __init__(self,num_channels,blocks,mean_pool,batch_norm,use_skip,learning_rate,verbose,
                 name='network1',
                 width=32, height=32, depth=3,
                 num_classes=20, 
                 is_augmentation = False,
                 activation_func='relu',
                 optimizer='adam',
                 batch_size=32,
                 num_epochs= 20):
        super(YourModel, self).__init__(name, width, height, depth, num_classes, is_augmentation, 
                                        activation_func, optimizer, batch_size, num_epochs, 
                                        learning_rate, verbose)
        self.num_channels = num_channels
        self.mean_pool = mean_pool
        self.batch_norm = batch_norm
        self.use_skip = use_skip
        self.blocks = blocks
    
    
    def build_cnn(self,x):
        #Insert your code here
        x1 = layers.Conv2D(self.num_channels, (3, 3), strides=(1, 1), padding='same')(x)
        if self.batch_norm:
            x1 = layers.BatchNormalization() (x1)
        x1 = layers.Activation('relu') (x1)
        x2 = layers.Conv2D(self.num_channels, (3, 3), strides=(1, 1), padding='same')(x1)
        if self.batch_norm:
                x2 = layers.BatchNormalization()(x2)
        if x.shape != x2.shape:
            if x2.shape[3] > x.shape[3]:
                pad_tns = tf.constant([[0,0],[0, 0] ,[0,0], [x2.shape[3] - x.shape[3] , 0]])
                x = tf.pad(x, pad_tns, mode ='CONSTANT', constant_values=0)
            else:
                pad_tns = tf.constant([[0,0],[0, 0] ,[0,0], [x.shape[3] - x2.shape[3] , 0]])
                x = tf.pad(x2, pad_tns, mode ='CONSTANT', constant_values=0)
        x2_skip = layers.add([x, x2])
        x2_skip = layers.Activation('relu')(x2_skip)
        if self.mean_pool:
            output_layer = layers.AveragePooling2D(pool_size=(2, 2), padding='same')(x2_skip)
        else: 
            output_layer = x2_skip
        return output_layer
    
    def build_resnet(self):
        self.input_layer = layers.Input(shape=(self.width, self.height, self.depth))
        x = self.input_layer
        for i in range (self.blocks):
            x = self.build_cnn(x)
            self.num_channels = self.num_channels*2
        output_layer = GlobalAveragePooling2D()(x)
        output_layer = layers.Flatten()(output_layer)
        output_layer = layers.Dense(self.num_classes, activation='softmax')(output_layer)
        self.model = tf.keras.models.Model(inputs=self.input_layer, outputs=output_layer)
        self.model.compile(optimizer=self.optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])

    def fit(self, data_manager, batch_size=None, num_epochs=None):
        #Insert your code here
        batch_size = self.batch_size if batch_size is None else batch_size
        num_epochs = self.num_epochs if num_epochs is None else num_epochs
        self.model.compile(optimizer=self.optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
        early_stopping = EarlyStopping(monitor='val_accuracy', patience=3)
        callbacks = [early_stopping]
        if self.is_augmentation == True:
            datagen = tf.keras.preprocessing.image.ImageDataGenerator(
                width_shift_range=0.05,
                zoom_range = 0.05,
                rotation_range=5
            )
            datagen.fit(data_manager.X_train)
            it = datagen.flow(data_manager.X_train,data_manager.y_train, shuffle = True, batch_size =batch_size)
            self.history = self.model.fit(x = it, 
                          validation_data = (data_manager.X_valid, data_manager.y_valid), 
                          epochs = num_epochs, batch_size = batch_size, callbacks = callbacks,verbose= self.verbose)
        else:
            self.history = self.model.fit(x = data_manager.X_train, y = data_manager.y_train,
                                      validation_data = (data_manager.X_valid, data_manager.y_valid), 
                                      epochs = num_epochs, batch_size = batch_size, callbacks = callbacks,verbose= self.verbose)
        

Leverage your best model with the data augmentation and try to observe the difference in performance between using data augmentation and non-using it.

Write your answer and observation here¶

By leveraging the findings from above (best config):

Blocks Skip Pool Rate Accuracy
4 True True 0.001 58.77%

I have tried to train it on both with and without data augmentation. Let's look at the results here:

With Data Augmentation + Early Stopping Accuracy:

  • Training Accuracy: 88.47%
  • Validation Accuracy: 56.98%
  • Test Accuracy: 57.5%

Without Data Augmentation + Early Stopping Accuracy:

  • Training Accuracy: 89.71%
  • Validation Accuracy: 52.33%
  • Test Accuracy: 52.1%

As we can see, data augmentation produces lower training accuracy when compared to without data augmentation. This is due to the fact that data augmentation makes changes to the original training data thus diversifying it and allowing our model to learn how to generalise better instead of relying solely on learning patterns.This can be seen in the higher validation and test accuracy in our model with data augmentation

In [90]:
#Insert your code here. You can add more cells if necessary
testModel = YourModel(32,4, True, True,True, 0.001,True)
testModel.build_resnet()
testModel.fit(data_manager, batch_size = 16, num_epochs = 30)
testModel.compute_accuracy(data_manager.X_test, data_manager.y_test)
Epoch 1/30
473/473 [==============================] - 20s 41ms/step - loss: 2.5141 - accuracy: 0.2286 - val_loss: 2.5227 - val_accuracy: 0.2442
Epoch 2/30
473/473 [==============================] - 20s 41ms/step - loss: 2.1007 - accuracy: 0.3468 - val_loss: 2.0654 - val_accuracy: 0.3668
Epoch 3/30
473/473 [==============================] - 20s 42ms/step - loss: 1.8436 - accuracy: 0.4198 - val_loss: 1.8843 - val_accuracy: 0.4228
Epoch 4/30
473/473 [==============================] - 20s 41ms/step - loss: 1.6569 - accuracy: 0.4819 - val_loss: 1.7000 - val_accuracy: 0.4662
Epoch 5/30
473/473 [==============================] - 19s 41ms/step - loss: 1.4680 - accuracy: 0.5409 - val_loss: 1.7330 - val_accuracy: 0.4746
Epoch 6/30
473/473 [==============================] - 20s 43ms/step - loss: 1.3347 - accuracy: 0.5811 - val_loss: 1.6279 - val_accuracy: 0.5148
Epoch 7/30
473/473 [==============================] - 19s 41ms/step - loss: 1.2063 - accuracy: 0.6224 - val_loss: 1.5319 - val_accuracy: 0.5423
Epoch 8/30
473/473 [==============================] - 20s 42ms/step - loss: 1.0968 - accuracy: 0.6548 - val_loss: 1.5071 - val_accuracy: 0.5497
Epoch 9/30
473/473 [==============================] - 19s 41ms/step - loss: 0.9708 - accuracy: 0.6910 - val_loss: 1.7288 - val_accuracy: 0.5349
Epoch 10/30
473/473 [==============================] - 19s 40ms/step - loss: 0.8555 - accuracy: 0.7279 - val_loss: 1.7070 - val_accuracy: 0.5201
Epoch 11/30
473/473 [==============================] - 19s 40ms/step - loss: 0.7318 - accuracy: 0.7672 - val_loss: 1.5414 - val_accuracy: 0.5698
Epoch 12/30
473/473 [==============================] - 19s 40ms/step - loss: 0.6186 - accuracy: 0.8026 - val_loss: 1.4291 - val_accuracy: 0.6099
Epoch 13/30
473/473 [==============================] - 19s 41ms/step - loss: 0.5340 - accuracy: 0.8303 - val_loss: 1.6177 - val_accuracy: 0.5793
Epoch 14/30
473/473 [==============================] - 19s 40ms/step - loss: 0.4371 - accuracy: 0.8618 - val_loss: 1.6358 - val_accuracy: 0.6004
Epoch 15/30
473/473 [==============================] - 19s 40ms/step - loss: 0.3581 - accuracy: 0.8847 - val_loss: 1.7178 - val_accuracy: 0.5698
15/15 [==============================] - 0s 20ms/step - loss: 1.6682 - accuracy: 0.5751
Out[90]:
0.5750528573989868
In [92]:
testModel.plot_progress()
testModel.plot_prediction(data_manager.X_test, data_manager.y_test, data_manager.classes)
30/30 [==============================] - 0s 11ms/step
<Figure size 640x480 with 0 Axes>
In [93]:
testModel.model.save('models/augmentation_true_model.h5')
In [96]:
class YourModel(DefaultModel):
    def __init__(self,num_channels,blocks,mean_pool,batch_norm,use_skip,learning_rate,verbose,
                 name='network1',
                 width=32, height=32, depth=3,
                 num_classes=20, 
                 is_augmentation = False,
                 activation_func='relu',
                 optimizer='adam',
                 batch_size=32,
                 num_epochs= 20):
        super(YourModel, self).__init__(name, width, height, depth, num_classes, is_augmentation, 
                                        activation_func, optimizer, batch_size, num_epochs, 
                                        learning_rate, verbose)
        self.num_channels = num_channels
        self.mean_pool = mean_pool
        self.batch_norm = batch_norm
        self.use_skip = use_skip
        self.blocks = blocks
    
    
    def build_cnn(self,x):
        #Insert your code here
        x1 = layers.Conv2D(self.num_channels, (3, 3), strides=(1, 1), padding='same')(x)
        if self.batch_norm:
            x1 = layers.BatchNormalization() (x1)
        x1 = layers.Activation('relu') (x1)
        x2 = layers.Conv2D(self.num_channels, (3, 3), strides=(1, 1), padding='same')(x1)
        if self.batch_norm:
                x2 = layers.BatchNormalization()(x2)
        if x.shape != x2.shape:
            if x2.shape[3] > x.shape[3]:
                pad_tns = tf.constant([[0,0],[0, 0] ,[0,0], [x2.shape[3] - x.shape[3] , 0]])
                x = tf.pad(x, pad_tns, mode ='CONSTANT', constant_values=0)
            else:
                pad_tns = tf.constant([[0,0],[0, 0] ,[0,0], [x.shape[3] - x2.shape[3] , 0]])
                x = tf.pad(x2, pad_tns, mode ='CONSTANT', constant_values=0)
        x2_skip = layers.add([x, x2])
        x2_skip = layers.Activation('relu')(x2_skip)
        if self.mean_pool:
            output_layer = layers.AveragePooling2D(pool_size=(2, 2), padding='same')(x2_skip)
        else: 
            output_layer = x2_skip
        return output_layer
    
    def build_resnet(self):
        self.input_layer = layers.Input(shape=(self.width, self.height, self.depth))
        x = self.input_layer
        for i in range (self.blocks):
            x = self.build_cnn(x)
            self.num_channels = self.num_channels*2
        output_layer = GlobalAveragePooling2D()(x)
        output_layer = layers.Flatten()(output_layer)
        output_layer = layers.Dense(self.num_classes, activation='softmax')(output_layer)
        self.model = tf.keras.models.Model(inputs=self.input_layer, outputs=output_layer)
        self.model.compile(optimizer=self.optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])

    def fit(self, data_manager, batch_size=None, num_epochs=None):
        #Insert your code here
        batch_size = self.batch_size if batch_size is None else batch_size
        num_epochs = self.num_epochs if num_epochs is None else num_epochs
        self.model.compile(optimizer=self.optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
        early_stopping = EarlyStopping(monitor='val_accuracy', patience=3)
        callbacks = [early_stopping]
        if self.is_augmentation == True:
            datagen = tf.keras.preprocessing.image.ImageDataGenerator(
                brightness_range=(0.9, 1.1),
                horizontal_flip=True,
                width_shift_range=0.2,
                height_shift_range=0.1,
                rotation_range= 10,
                zoom_range = 0.1
            )
            datagen.fit(data_manager.X_train)
            it = datagen.flow(data_manager.X_train,data_manager.y_train, shuffle = False, batch_size =batch_size)
            self.history = self.model.fit(x = it, 
                          validation_data = (data_manager.X_valid, data_manager.y_valid), 
                          epochs = num_epochs, batch_size = batch_size, callbacks = callbacks,verbose= self.verbose)
        else:
            self.history = self.model.fit(x = data_manager.X_train, y = data_manager.y_train,
                                      validation_data = (data_manager.X_valid, data_manager.y_valid), 
                                      epochs = num_epochs, batch_size = batch_size, callbacks = callbacks,verbose= self.verbose)
        
In [97]:
testModel = YourModel(32,4, True, True,True, 0.001,True)
testModel.build_resnet()
testModel.fit(data_manager, batch_size = 16, num_epochs = 20)
testModel.compute_accuracy(data_manager.X_test, data_manager.y_test)
Epoch 1/20
473/473 [==============================] - 21s 44ms/step - loss: 2.4680 - accuracy: 0.2439 - val_loss: 2.4805 - val_accuracy: 0.2463
Epoch 2/20
473/473 [==============================] - 20s 43ms/step - loss: 2.0364 - accuracy: 0.3698 - val_loss: 2.0306 - val_accuracy: 0.3837
Epoch 3/20
473/473 [==============================] - 20s 42ms/step - loss: 1.7709 - accuracy: 0.4516 - val_loss: 1.8833 - val_accuracy: 0.4397
Epoch 4/20
473/473 [==============================] - 20s 42ms/step - loss: 1.5909 - accuracy: 0.5071 - val_loss: 1.7732 - val_accuracy: 0.4545
Epoch 5/20
473/473 [==============================] - 20s 41ms/step - loss: 1.4055 - accuracy: 0.5636 - val_loss: 1.5745 - val_accuracy: 0.5095
Epoch 6/20
473/473 [==============================] - 19s 41ms/step - loss: 1.2683 - accuracy: 0.6052 - val_loss: 1.5717 - val_accuracy: 0.5148
Epoch 7/20
473/473 [==============================] - 20s 42ms/step - loss: 1.1440 - accuracy: 0.6467 - val_loss: 1.5202 - val_accuracy: 0.5613
Epoch 8/20
473/473 [==============================] - 20s 42ms/step - loss: 1.0157 - accuracy: 0.6758 - val_loss: 1.4315 - val_accuracy: 0.5645
Epoch 9/20
473/473 [==============================] - 20s 42ms/step - loss: 0.8867 - accuracy: 0.7190 - val_loss: 1.9850 - val_accuracy: 0.4799
Epoch 10/20
473/473 [==============================] - 20s 42ms/step - loss: 0.7710 - accuracy: 0.7516 - val_loss: 1.5861 - val_accuracy: 0.5455
Epoch 11/20
473/473 [==============================] - 20s 42ms/step - loss: 0.6408 - accuracy: 0.7909 - val_loss: 1.6589 - val_accuracy: 0.5793
Epoch 12/20
473/473 [==============================] - 20s 42ms/step - loss: 0.5228 - accuracy: 0.8315 - val_loss: 1.7419 - val_accuracy: 0.5603
Epoch 13/20
473/473 [==============================] - 20s 42ms/step - loss: 0.4382 - accuracy: 0.8575 - val_loss: 1.7577 - val_accuracy: 0.5687
Epoch 14/20
473/473 [==============================] - 20s 41ms/step - loss: 0.3249 - accuracy: 0.8971 - val_loss: 2.1832 - val_accuracy: 0.5233
15/15 [==============================] - 0s 19ms/step - loss: 2.2280 - accuracy: 0.5211
Out[97]:
0.5211416482925415
In [100]:
testModel.plot_progress()
testModel.plot_prediction(data_manager.X_test, data_manager.y_test, data_manager.classes)
30/30 [==============================] - 0s 16ms/step
<Figure size 640x480 with 0 Axes>
In [101]:
testModel.model.save('models/augmentation_false_model.h5')

**Question 3.4** Exploring Data Mixup Technique for Improving Generalization Ability.

[4 points]

Data mixup is another super-simple technique used to boost the generalization ability of deep learning models. You need to incoroporate data mixup technique to the above deep learning model and experiment its performance. There are some papers and documents for data mixup as follows:

  • Main paper for data mixup link for main paper and a good article article link.

You need to extend your model developed above, train a model using data mixup, and write your observations and comments about the result.

Write your answer and observation here¶

With Data Augmentation + Early Stopping Accuracy:

  • Training Accuracy: 88.47%
  • Validation Accuracy: 56.98%
  • Test Accuracy: 57.5%

With all the above and data mixup:

  • Training Accuracy: 71.14%
  • Validation Accuracy: 52.85%
  • Test Accuracy: 48.31%

We can observe a decrease in performance when data mixup is added. This can be due to several factors which includes: The introduced augmentation to the data may mean our model is too simple to recognise patterns within them and may be struggling / identifying the wrong patterns. This can be rectified by increasing model complexity (adding layers etc.) and training epochs so that the model can learn to deal with the augmentations.

In [175]:
sp = SimplePreprocessor(width=32, height=32)
data_manager = DatasetManager([sp])
data_manager.load(label_folder_dict, verbose=100)
data_manager.process_data_label()
data_manager.train_valid_test_split()
birds 512
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
bottles 432
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
breads 432
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
butterfiles 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
cakes 432
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
cats 501
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
chickens 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
cows 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
dogs 501
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
ducks 496
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
elephants 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
fishes 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
handguns 448
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
horses 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
lions 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
lipsticks 400
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
seals 448
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
snakes 496
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
spiders 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
vases 368
Processed 100/500
Processed 200/500
Processed 300/500
In [176]:
from keras.utils import to_categorical
import random

def mixup(data_manager,batch_size,alpha=0.2):
    l = len(data_manager.X_train)
    mixed_data = []
    mixed_labels = []
    lam = np.random.beta(alpha,alpha)
    print ("shape before mixup: X: ",data_manager.X_train.shape," y: ",data_manager.y_train.shape)
    for i in range(0, l, 4):
        if i + 1 < l:
            x1, y1 = data_manager.X_train[i], data_manager.y_train[i]
            x2, y2 = data_manager.X_train[i + 1], data_manager.y_train[i + 1]
            # Generate random mixing coefficient from a beta distribution
            lam = np.random.beta(alpha,alpha)
            mixed_x = (lam * x1) + ((1 - lam) * x2)
            mixed_y = (lam * y1) + ((1 - lam) * y2)
            mixed_data.append(mixed_x)
            mixed_labels.append(mixed_y)

    data_manager.X_train = np.concatenate((data_manager.X_train, np.array(mixed_data)))
    data_manager.y_train = np.concatenate((data_manager.y_train,np.array(mixed_labels)))
In [177]:
class YourModel(DefaultModel):
    def __init__(self,num_channels,blocks,mean_pool,batch_norm,use_skip,learning_rate,verbose,
                 name='network1',
                 width=32, height=32, depth=3,
                 num_classes=20, 
                 is_augmentation = True,
                 activation_func='relu',
                 optimizer='adam',
                 batch_size=32,
                 num_epochs= 20):
        super(YourModel, self).__init__(name, width, height, depth, num_classes, is_augmentation, 
                                        activation_func, optimizer, batch_size, num_epochs, 
                                        learning_rate, verbose)
        self.num_channels = num_channels
        self.mean_pool = mean_pool
        self.batch_norm = batch_norm
        self.use_skip = use_skip
        self.blocks = blocks
    
    
    def build_cnn(self,x):
        #Insert your code here
        x1 = layers.Conv2D(self.num_channels, (3, 3), strides=(1, 1), padding='same')(x)
        if self.batch_norm:
            x1 = layers.BatchNormalization() (x1)
        x1 = layers.Activation('relu') (x1)
        x2 = layers.Conv2D(self.num_channels, (3, 3), strides=(1, 1), padding='same')(x1)
        if self.batch_norm:
                x2 = layers.BatchNormalization()(x2)
        if x.shape != x2.shape:
            if x2.shape[3] > x.shape[3]:
                pad_tns = tf.constant([[0,0],[0, 0] ,[0,0], [x2.shape[3] - x.shape[3] , 0]])
                x = tf.pad(x, pad_tns, mode ='CONSTANT', constant_values=0)
            else:
                pad_tns = tf.constant([[0,0],[0, 0] ,[0,0], [x.shape[3] - x2.shape[3] , 0]])
                x = tf.pad(x2, pad_tns, mode ='CONSTANT', constant_values=0)
        x2_skip = layers.add([x, x2])
        x2_skip = layers.Activation('relu')(x2_skip)
        if self.mean_pool:
            output_layer = layers.AveragePooling2D(pool_size=(2, 2), padding='same')(x2_skip)
        else: 
            output_layer = x2_skip
        return output_layer
    
    def build_resnet(self):
        self.input_layer = layers.Input(shape=(self.width, self.height, self.depth))
        x = self.input_layer
        for i in range (self.blocks):
            x = self.build_cnn(x)
            self.num_channels = self.num_channels*2
        output_layer = GlobalAveragePooling2D()(x)
        output_layer = layers.Flatten()(output_layer)
        output_layer = layers.Dense(self.num_classes, activation='softmax')(output_layer)
        self.model = tf.keras.models.Model(inputs=self.input_layer, outputs=output_layer)
        self.model.compile(optimizer=self.optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
   
    def fit(self, data_manager, batch_size=None, num_epochs=None):
        #Insert your code here
        batch_size = self.batch_size if batch_size is None else batch_size
        num_epochs = self.num_epochs if num_epochs is None else num_epochs
        self.model.compile(optimizer=self.optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
        early_stopping = EarlyStopping(monitor='val_accuracy', patience=3)
        callbacks = [early_stopping]
        if self.is_augmentation == True:
            datagen = tf.keras.preprocessing.image.ImageDataGenerator(
                width_shift_range=0.05,
                zoom_range = 0.05,
            )
            datagen.fit(data_manager.X_train)
            it = datagen.flow(data_manager.X_train,data_manager.y_train, shuffle = False, batch_size =batch_size)
            self.history = self.model.fit(x = it, 
                          validation_data = (data_manager.X_valid, data_manager.y_valid), 
                          epochs = num_epochs, batch_size = batch_size, callbacks = callbacks,verbose= self.verbose)
        else:
            self.history = self.model.fit(x = data_manager.X_train, y = data_manager.y_train,
                                      validation_data = (data_manager.X_valid, data_manager.y_valid), 
                                      epochs = num_epochs, batch_size = batch_size, callbacks = callbacks,verbose= self.verbose)
        
In [178]:
testModel = YourModel(16,5, True, True,True, 0.001,True)
testModel.build_resnet()
mixup(data_manager,16)
testModel.fit(data_manager, batch_size = 16, num_epochs = 20)
testModel.compute_accuracy(data_manager.X_test, data_manager.y_test)
shape before mixup: X:  (7560, 32, 32, 3)  y:  (7560,)
Epoch 1/20
C:\Users\manut\anaconda3\envs\gpu\lib\site-packages\tensorflow\python\data\ops\structured_function.py:264: UserWarning: Even though the `tf.config.experimental_run_functions_eagerly` option is set, this option does not apply to tf.data functions. To force eager execution of tf.data functions, please use `tf.data.experimental.enable_debug_mode()`.
  warnings.warn(
591/591 [==============================] - 30s 51ms/step - loss: 2.6239 - accuracy: 0.1894 - val_loss: 2.4572 - val_accuracy: 0.2717
Epoch 2/20
591/591 [==============================] - 30s 50ms/step - loss: 2.3045 - accuracy: 0.2848 - val_loss: 2.2291 - val_accuracy: 0.3044
Epoch 3/20
591/591 [==============================] - 31s 52ms/step - loss: 2.1120 - accuracy: 0.3287 - val_loss: 1.8786 - val_accuracy: 0.4355
Epoch 4/20
591/591 [==============================] - 34s 57ms/step - loss: 1.9323 - accuracy: 0.3872 - val_loss: 2.2891 - val_accuracy: 0.3562
Epoch 5/20
591/591 [==============================] - 34s 57ms/step - loss: 1.7978 - accuracy: 0.4207 - val_loss: 1.7680 - val_accuracy: 0.4778
Epoch 6/20
591/591 [==============================] - 33s 56ms/step - loss: 1.6581 - accuracy: 0.4618 - val_loss: 1.7923 - val_accuracy: 0.4715
Epoch 7/20
591/591 [==============================] - 34s 58ms/step - loss: 1.5390 - accuracy: 0.4896 - val_loss: 1.9705 - val_accuracy: 0.4556
Epoch 8/20
591/591 [==============================] - 34s 58ms/step - loss: 1.3993 - accuracy: 0.5232 - val_loss: 1.7156 - val_accuracy: 0.4915
Epoch 9/20
591/591 [==============================] - 35s 59ms/step - loss: 1.2768 - accuracy: 0.5550 - val_loss: 1.7433 - val_accuracy: 0.5021
Epoch 10/20
591/591 [==============================] - 34s 58ms/step - loss: 1.1250 - accuracy: 0.5878 - val_loss: 1.7705 - val_accuracy: 0.5148
Epoch 11/20
591/591 [==============================] - 34s 57ms/step - loss: 0.9957 - accuracy: 0.6177 - val_loss: 1.8656 - val_accuracy: 0.5011
Epoch 12/20
591/591 [==============================] - 34s 58ms/step - loss: 0.8425 - accuracy: 0.6492 - val_loss: 1.8326 - val_accuracy: 0.5285
Epoch 13/20
591/591 [==============================] - 34s 58ms/step - loss: 0.7374 - accuracy: 0.6666 - val_loss: 2.0426 - val_accuracy: 0.4937
Epoch 14/20
591/591 [==============================] - 35s 59ms/step - loss: 0.6287 - accuracy: 0.6916 - val_loss: 2.2982 - val_accuracy: 0.4820
Epoch 15/20
591/591 [==============================] - 35s 59ms/step - loss: 0.5435 - accuracy: 0.7114 - val_loss: 2.0785 - val_accuracy: 0.5285
15/15 [==============================] - 0s 24ms/step - loss: 2.3560 - accuracy: 0.4831
Out[178]:
0.4830866754055023
In [170]:
testModel.plot_progress()
testModel.plot_prediction(data_manager.X_test, data_manager.y_test, data_manager.classes)
30/30 [==============================] - 0s 15ms/step
<Figure size 640x480 with 0 Axes>
In [179]:
testModel.model.save('models/data_mixup_model.h5')

**Question 3.5** Implement the one-versus-all (OVA) loss. The details are as follows:

  • You need to apply the sigmoid activation function to logits $h = [h_1, h_2,...,h_M]$ instead of the softmax activation function as usual to obtain $p = [p_1, p_2,...,p_M]$, meaning that $p_i = sigmoid(h_i), i=1,...,M$. Note that $M$ is the number of classes.
  • Given a data example $x$ with the ground-truth label $y$, the idea is to maximize the likelihood $p_y$ and to minimize the likelihoods $p_i, i \neq y$. Therefore, the objective function is to find the model parameters to
    • $\max\left\{ \log p_{y}+\sum_{i\neq y}\log(1-p_{i})\right\}$ or equivalently $\min\left\{ -\log p_{y}-\sum_{i\neq y}\log(1-p_{i})\right\}$.
    • For example, if $M=3$ and $y=2$, you need to minimize $\min\left\{ -\log(1-p_{1})-\log p_{2}-\log(1-p_{3})\right\}$.

Compare the model trained with the OVA loss and the same model trained with the standard cross-entropy loss.

[5 points]

Sometimes bug will occur when running OVA, please restart kernel and rerun the datasetmanager method before running this if the bug occurs

The OVA loss model performs slightly worse than the standard cross-entropy loss. The accuracies can be shown below:

With Data Augmentation + Early Stopping Accuracy:

  • Training Accuracy: 88.47%
  • Validation Accuracy: 56.98%
  • Test Accuracy: 57.5%

OVA Loss + With Data Augmentation + Early Stopping Accuracy:

  • Training Accuracy: 57.43%
  • Validation Accuracy: 50.11%
  • Test Accuracy: 52.8%

This value seems to vary run by run as OVA loss may sometimes exceed the performance of the standard CE loss model but the overall difference seems to be small.

In [13]:
tf.config.run_functions_eagerly(True)    
#Insert your code here. You can add more cells if necessary
class OVA_loss (tf.keras.losses.Loss):
    def __init__(self,eps=1E-10, num_classes = 20):
        super(OVA_loss, self).__init__()
        self.eps = eps
        self.num_classes = num_classes

    def call(self, y_true, y_pred):

        y_true_1_hot = tf.one_hot(tf.transpose(tf.cast(y_true, tf.int32), perm= [1,0])[0], depth= self.num_classes, axis=-1)
        loss_true_class = -tf.math.log(y_pred + self.eps) * y_true_1_hot
        # Find the index of the true class
        true_class_index = tf.argmax(y_true_1_hot, axis=-1)
        # Create a mask for the true class
        mask = tf.one_hot(true_class_index, depth= self.num_classes, on_value=1.0, off_value=0.0)
        # Invert the mask (0s become 1s, and 1s become 0s)
        y_true_modified = 1 - mask
                
        # Compute the negative log-likelihood for the other class
        loss_other_class = -tf.math.log(1.0 - y_pred + self.eps) * y_true_modified

        # Combine the losses for both classes
        total_loss = loss_true_class + loss_other_class

        return total_loss

class YourModel(DefaultModel):
    def __init__(self,num_channels,blocks,mean_pool,batch_norm,use_skip,learning_rate,verbose,
                 name='network1',
                 width=32, height=32, depth=3,
                 num_classes=20, 
                 is_augmentation = True,
                 activation_func='relu',
                 optimizer='adam',
                 batch_size=32,
                 num_epochs= 20):
        super(YourModel, self).__init__(name, width, height, depth, num_classes, is_augmentation, 
                                        activation_func, optimizer, batch_size, num_epochs, 
                                        learning_rate, verbose)
        self.num_channels = num_channels
        self.mean_pool = mean_pool
        self.batch_norm = batch_norm
        self.use_skip = use_skip
        self.blocks = blocks
    
    
    def build_cnn(self,x):
        #Insert your code here
        x1 = layers.Conv2D(self.num_channels, (3, 3), strides=(1, 1), padding='same')(x)
        if self.batch_norm:
            x1 = layers.BatchNormalization() (x1)
        x1 = layers.Activation('relu') (x1)
        x2 = layers.Conv2D(self.num_channels, (3, 3), strides=(1, 1), padding='same')(x1)
        if self.batch_norm:
                x2 = layers.BatchNormalization()(x2)
        if x.shape != x2.shape:
            if x2.shape[3] > x.shape[3]:
                pad_tns = tf.constant([[0,0],[0, 0] ,[0,0], [x2.shape[3] - x.shape[3] , 0]])
                x = tf.pad(x, pad_tns, mode ='CONSTANT', constant_values=0)
            else:
                pad_tns = tf.constant([[0,0],[0, 0] ,[0,0], [x.shape[3] - x2.shape[3] , 0]])
                x = tf.pad(x2, pad_tns, mode ='CONSTANT', constant_values=0)
        x2_skip = layers.add([x, x2])
        x2_skip = layers.Activation('relu')(x2_skip)
        if self.mean_pool:
            output_layer = layers.AveragePooling2D(pool_size=(2, 2), padding='same')(x2_skip)
        else: 
            output_layer = x2_skip
        return output_layer
    
    def build_resnet(self):
        self.input_layer = layers.Input(shape=(self.width, self.height, self.depth))
        x = self.input_layer
        for i in range (self.blocks):
            x = self.build_cnn(x)
            self.num_channels = self.num_channels*2
        output_layer = GlobalAveragePooling2D()(x)
        output_layer = layers.Flatten()(output_layer)
        output_layer = layers.Dense(self.num_classes, activation='sigmoid')(output_layer)
        self.model = tf.keras.models.Model(inputs=self.input_layer, outputs=output_layer)
        self.model.compile(optimizer=self.optimizer, loss= OVA_loss(), metrics=['accuracy'])
   
    def fit(self, data_manager, batch_size=None, num_epochs=None):
        #Insert your code here
        batch_size = self.batch_size if batch_size is None else batch_size
        num_epochs = self.num_epochs if num_epochs is None else num_epochs
        self.model.compile(optimizer=self.optimizer, loss= OVA_loss(), metrics=['accuracy'])
        early_stopping = EarlyStopping(monitor='val_accuracy', patience=6)
        callbacks = [early_stopping]
        if self.is_augmentation == True:
            datagen = tf.keras.preprocessing.image.ImageDataGenerator(
                brightness_range=(0.9, 1.1),
                width_shift_range=0.2,
                height_shift_range=0.1,
                rotation_range= 10,
            )
            datagen.fit(data_manager.X_train)
            it = datagen.flow(data_manager.X_train,data_manager.y_train, shuffle = False, batch_size =batch_size)
            self.history = self.model.fit(x = it, 
                          validation_data = (data_manager.X_valid, data_manager.y_valid), 
                          epochs = num_epochs, batch_size = batch_size, callbacks = callbacks,verbose= self.verbose)
        else:
            self.history = self.model.fit(x = data_manager.X_train, y = data_manager.y_train,
                                      validation_data = (data_manager.X_valid, data_manager.y_valid), 
                                      epochs = num_epochs, batch_size = batch_size, callbacks = callbacks,verbose= self.verbose)
        
In [14]:
testModel = YourModel(32,4, True, True,True, 0.001,True)
testModel.build_resnet()
mixup(data_manager,16)
testModel.fit(data_manager, batch_size = 32, num_epochs = 40)
testModel.compute_accuracy(data_manager.X_test, data_manager.y_test)
shape before mixup: X:  (7560, 32, 32, 3)  y:  (7560,)
Epoch 1/40
C:\Users\manut\anaconda3\envs\gpu\lib\site-packages\tensorflow\python\data\ops\structured_function.py:264: UserWarning: Even though the `tf.config.experimental_run_functions_eagerly` option is set, this option does not apply to tf.data functions. To force eager execution of tf.data functions, please use `tf.data.experimental.enable_debug_mode()`.
  warnings.warn(
355/355 [==============================] - 22s 54ms/step - loss: 0.1922 - accuracy: 0.1469 - val_loss: 0.1752 - val_accuracy: 0.2548
Epoch 2/40
355/355 [==============================] - 19s 54ms/step - loss: 0.1721 - accuracy: 0.2139 - val_loss: 0.1638 - val_accuracy: 0.2854
Epoch 3/40
355/355 [==============================] - 19s 54ms/step - loss: 0.1658 - accuracy: 0.2498 - val_loss: 0.1596 - val_accuracy: 0.3034
Epoch 4/40
355/355 [==============================] - 19s 53ms/step - loss: 0.1597 - accuracy: 0.2867 - val_loss: 0.1429 - val_accuracy: 0.3953
Epoch 5/40
355/355 [==============================] - 19s 54ms/step - loss: 0.1539 - accuracy: 0.3161 - val_loss: 0.1389 - val_accuracy: 0.4228
Epoch 6/40
355/355 [==============================] - 19s 54ms/step - loss: 0.1490 - accuracy: 0.3385 - val_loss: 0.1300 - val_accuracy: 0.4693
Epoch 7/40
355/355 [==============================] - 19s 54ms/step - loss: 0.1441 - accuracy: 0.3597 - val_loss: 0.1359 - val_accuracy: 0.4313
Epoch 8/40
355/355 [==============================] - 19s 54ms/step - loss: 0.1389 - accuracy: 0.3847 - val_loss: 0.1400 - val_accuracy: 0.4397
Epoch 9/40
355/355 [==============================] - 19s 54ms/step - loss: 0.1349 - accuracy: 0.4059 - val_loss: 0.1252 - val_accuracy: 0.5032
Epoch 10/40
355/355 [==============================] - 19s 55ms/step - loss: 0.1305 - accuracy: 0.4248 - val_loss: 0.1472 - val_accuracy: 0.4207
Epoch 11/40
355/355 [==============================] - 19s 54ms/step - loss: 0.1261 - accuracy: 0.4369 - val_loss: 0.1267 - val_accuracy: 0.5021
Epoch 12/40
355/355 [==============================] - 19s 54ms/step - loss: 0.1214 - accuracy: 0.4577 - val_loss: 0.1312 - val_accuracy: 0.4937
Epoch 13/40
355/355 [==============================] - 19s 54ms/step - loss: 0.1175 - accuracy: 0.4652 - val_loss: 0.1252 - val_accuracy: 0.5349
Epoch 14/40
355/355 [==============================] - 19s 55ms/step - loss: 0.1124 - accuracy: 0.4874 - val_loss: 0.1533 - val_accuracy: 0.4524
Epoch 15/40
355/355 [==============================] - 19s 54ms/step - loss: 0.1086 - accuracy: 0.4976 - val_loss: 0.1320 - val_accuracy: 0.5275
Epoch 16/40
355/355 [==============================] - 19s 54ms/step - loss: 0.1035 - accuracy: 0.5130 - val_loss: 0.1427 - val_accuracy: 0.5116
Epoch 17/40
355/355 [==============================] - 19s 54ms/step - loss: 0.0995 - accuracy: 0.5309 - val_loss: 0.1481 - val_accuracy: 0.4958
Epoch 18/40
355/355 [==============================] - 19s 54ms/step - loss: 0.0948 - accuracy: 0.5362 - val_loss: 0.1305 - val_accuracy: 0.5412
Epoch 19/40
355/355 [==============================] - 19s 54ms/step - loss: 0.0906 - accuracy: 0.5473 - val_loss: 0.1388 - val_accuracy: 0.5211
Epoch 20/40
355/355 [==============================] - 19s 55ms/step - loss: 0.0870 - accuracy: 0.5583 - val_loss: 0.1559 - val_accuracy: 0.4704
Epoch 21/40
355/355 [==============================] - 19s 54ms/step - loss: 0.0825 - accuracy: 0.5648 - val_loss: 0.1473 - val_accuracy: 0.4894
Epoch 22/40
355/355 [==============================] - 19s 54ms/step - loss: 0.0771 - accuracy: 0.5762 - val_loss: 0.1559 - val_accuracy: 0.4672
Epoch 23/40
355/355 [==============================] - 19s 54ms/step - loss: 0.0734 - accuracy: 0.5833 - val_loss: 0.1557 - val_accuracy: 0.5159
Epoch 24/40
355/355 [==============================] - 19s 54ms/step - loss: 0.0700 - accuracy: 0.5862 - val_loss: 0.1415 - val_accuracy: 0.5518
Epoch 25/40
355/355 [==============================] - 19s 54ms/step - loss: 0.0657 - accuracy: 0.5994 - val_loss: 0.1450 - val_accuracy: 0.5476
Epoch 26/40
355/355 [==============================] - 19s 54ms/step - loss: 0.0618 - accuracy: 0.6021 - val_loss: 0.1670 - val_accuracy: 0.4820
Epoch 27/40
355/355 [==============================] - 19s 54ms/step - loss: 0.0593 - accuracy: 0.6057 - val_loss: 0.1708 - val_accuracy: 0.4503
Epoch 28/40
355/355 [==============================] - 19s 54ms/step - loss: 0.0563 - accuracy: 0.6096 - val_loss: 0.1721 - val_accuracy: 0.4757
Epoch 29/40
355/355 [==============================] - 19s 55ms/step - loss: 0.0542 - accuracy: 0.6115 - val_loss: 0.1744 - val_accuracy: 0.4831
Epoch 30/40
355/355 [==============================] - 19s 54ms/step - loss: 0.0504 - accuracy: 0.6131 - val_loss: 0.1577 - val_accuracy: 0.5328
15/15 [==============================] - 0s 21ms/step - loss: 0.1532 - accuracy: 0.5338
Out[14]:
0.5338266491889954
In [15]:
testModel.plot_progress()
testModel.plot_prediction(data_manager.X_test, data_manager.y_test, data_manager.classes)
30/30 [==============================] - 0s 11ms/step
<Figure size 640x480 with 0 Axes>
In [16]:
testModel.model.save('models/ova_model.h5')

**Question 3.6** Attack your best obtained model with PGD, MIM, and FGSM attacks with $\epsilon= 0.0313, k=20, \eta= 0.002$ on the testing set. Write the code for the attacks and report the robust accuracies. Also choose a random set of 20 clean images in the testing set and visualize the original and attacked images.

[3 points]

Rerun Data Manager for this part to undo data mixup

In [184]:
sp = SimplePreprocessor(width=32, height=32)
data_manager = DatasetManager([sp])
data_manager.load(label_folder_dict, verbose=100)
data_manager.process_data_label()
data_manager.train_valid_test_split()
birds 512
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
bottles 432
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
breads 432
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
butterfiles 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
cakes 432
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
cats 501
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
chickens 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
cows 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
dogs 501
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
ducks 496
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
elephants 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
fishes 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
handguns 448
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
horses 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
lions 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
lipsticks 400
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
seals 448
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
snakes 496
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
spiders 500
Processed 100/500
Processed 200/500
Processed 300/500
Processed 400/500
Processed 500/500
vases 368
Processed 100/500
Processed 200/500
Processed 300/500
In [185]:
#FGSM Attack code adapted from Tut_06b
from tensorflow.keras.models import load_model
def fgsm_attack(model, input_image, input_label=None,
               epsilon=0.0313,
               clip_value_min=0.,
               clip_value_max=1.0,
               soft_label=False,
               from_logits=True):
    """
    Args:
        model: pretrained model
        input_image: original (clean) input image (tensor)
        input_label: original label (tensor, categorical representation)
        epsilon: perturbation boundary
        clip_value_min, clip_value_max: range of valid input
        from_logits = True: attack from logits otherwise attack from prediction probabilites
    Note:
        we expect the output of model should be logits vector
    """

    loss_fn = tf.keras.losses.sparse_categorical_crossentropy # compute CE loss from logits or prediction probabilities

    if type(input_image) is np.ndarray:
        input_image = tf.convert_to_tensor(input_image)

    if type(input_label) is np.ndarray:
        input_label = tf.convert_to_tensor(input_label)

    with tf.GradientTape() as tape:
        tape.watch(input_image)
        output = model(input_image)
        if not soft_label:
            loss = loss_fn(input_label, output, from_logits=from_logits) # use ground-truth label to attack
        else:
            pred_label = tf.math.argmax(output, axis=1) # use predicted label to attack
            loss = loss_fn(pred_label, output, from_logits=from_logits)

    gradient = tape.gradient(loss, input_image) # get the gradients of the loss w.r.t. the input image
    adv_image = input_image + epsilon * tf.sign(gradient) # get the final adversarial examples
    adv_image = tf.clip_by_value(adv_image, clip_value_min, clip_value_max) # clip to a valid range
    adv_image = tf.stop_gradient(adv_image) # stop the gradient to make the adversarial image as a constant input
    return adv_image
In [186]:
#PGD Attack code adapted from Tut_06b
def pgd_attack(model, input_image, input_label= None,
              epsilon=0.0313,
              num_steps=10,
              step_size=0.002,
              clip_value_min=0.,
              clip_value_max=1.0,
              soft_label=False,
              from_logits= False):
    """
    Args:
        model: pretrained model
        input_image: original (clean) input image (tensor)
        input_label: original label (tensor, categorical representation)
        epsilon: perturbation boundary
        num_steps: number of attack steps
        step_size: size of each move in each attack step
        clip_value_min, clip_value_max: range of valid input
        from_logits = True: attack from logits otherwise attack from prediction probabilites
    Note:
        we expect the output of model should be logits vector
    """

    loss_fn = tf.keras.losses.sparse_categorical_crossentropy  #compute CE loss from logits or prediction probabilities

    if type(input_image) is np.ndarray:
        input_image = tf.convert_to_tensor(input_image)

    if type(input_label) is np.ndarray:
        input_label = tf.convert_to_tensor(input_label)

    # random initialization around input_image
    random_noise = tf.random.uniform(shape=input_image.shape, minval=-epsilon, maxval=epsilon)
    adv_image = input_image + random_noise

    for _ in range(num_steps):
        with tf.GradientTape(watch_accessed_variables=False) as tape:
            tape.watch(adv_image)
            adv_output = model(adv_image)
            if not soft_label:
                loss = loss_fn(input_label, adv_output, from_logits= from_logits) # use ground-truth label to attack
            else:
                pred_label = tf.math.argmax(adv_output, axis=1)
                loss = loss_fn(pred_label, adv_output, from_logits= from_logits) # use predicted label to attack

        gradient = tape.gradient(loss, adv_image) # get the gradient of the loss w.r.t. the current point
        adv_image = adv_image + step_size * tf.sign(gradient) # move current adverarial example along the gradient direction with step size is eta
        adv_image = tf.clip_by_value(adv_image, input_image-epsilon, input_image+epsilon) # clip to a valid boundary
        adv_image = tf.clip_by_value(adv_image, clip_value_min, clip_value_max)  # clip to a valid range
        adv_image = tf.stop_gradient(adv_image) # stop the gradient to make the adversarial image as a constant input
    return adv_image
In [187]:
#MIM Attack code adapted from Tut_06b
def mim_attack(model, input_image, input_label= None,
              epsilon=0.0313,
              gamma= 0.9,
              num_steps=20,
              step_size=0.002,
              clip_value_min=0.,
              clip_value_max=1.0,
              soft_label=False,
              from_logits= True):
    """
    Args:
        model: pretrained model
        input_image: original (clean) input image (tensor)
        input_label: original label (tensor, categorical representation)
        epsilon: perturbation boundary
        gamma: momentum decay
        num_steps: number of attack steps
        step_size: size of each move in each attack step
        clip_value_min, clip_value_max: range of valid input
        from_logits = True: attack from logits otherwise attack from prediction probabilites
    Note:
        we expect the output of model should be logits vector
    """

    loss_fn = tf.keras.losses.sparse_categorical_crossentropy # compute CE loss from logits or prediction probabilities

    if type(input_image) is np.ndarray:
        input_image = tf.convert_to_tensor(input_image)

    if type(input_label) is np.ndarray:
        input_label = tf.convert_to_tensor(input_label)

    # random initialization around input_image
    random_noise = tf.random.uniform(shape=input_image.shape, minval=-epsilon, maxval=epsilon)
    adv_image = input_image + random_noise
    adv_noise = random_noise

    for _ in range(num_steps):
        with tf.GradientTape(watch_accessed_variables=False) as tape:
            tape.watch(adv_image)
            adv_output = model(adv_image)
            if not soft_label:
                loss = loss_fn(input_label, adv_output, from_logits=from_logits) # use ground-truth label to attack
            else:
                pred_label = tf.math.argmax(adv_output, axis=1)
                loss = loss_fn(pred_label, adv_output, from_logits=from_logits) # use predicted label to attack

        gradient = tape.gradient(loss, adv_image) # get the gradient of the loss w.r.t. the current point
        adv_image_new = adv_image + step_size * tf.sign(gradient) # move current adverarial example along the gradient direction with step size is eta
        adv_image_new = tf.clip_by_value(adv_image_new, input_image-epsilon, input_image+epsilon) # clip to a valid boundary
        adv_image_new = tf.clip_by_value(adv_image_new, clip_value_min, clip_value_max) # clip to a valid range
        adv_noise = gamma*adv_noise + (1-gamma)*(adv_image_new - adv_image)
        adv_image = adv_image_new
        adv_image = tf.stop_gradient(adv_image) # stop the gradient to make the adversarial image as a constant input
    adv_image = adv_image + adv_noise
    adv_image = tf.clip_by_value(adv_image, input_image-epsilon, input_image+epsilon) # clip to a valid boundary
    adv_image = tf.clip_by_value(adv_image, clip_value_min, clip_value_max) # clip to a valid range
    return adv_image
In [202]:
# Load the saved model
from matplotlib import pyplot as plt
from tensorflow.keras.applications.vgg19 import preprocess_input, decode_predictions

loaded_model = load_model('models/data_mixup_model.h5')

def attack_pgd ():
    random_indices = np.random.choice(len(data_manager.X_test), size=20, replace=False)
    correct = 0
    for i in random_indices:
        x = np.expand_dims(data_manager.X_test[i], axis=0)
        x = tf.cast(x, dtype=tf.float32)

        x_pgd= pgd_attack(loaded_model, x, data_manager.y_test[i])
        preds = loaded_model.predict(x)
        pgd_pred = loaded_model.predict(x_pgd)
        true_label= data_manager.classes[np.argmax(preds)]
        adv_label= data_manager.classes[np.argmax(pgd_pred)]
        if true_label == adv_label:
            correct += 1
        img = data_manager.X_test[i]
        img_pgd = np.squeeze(x_pgd.numpy())
        noise_pgd = np.clip(np.abs(img_pgd - img)*20, 0, 255).astype('int') # we multiply the noise by 20 for visualization
        fig = plt.figure(figsize=(15, 15*3))
        shown_img = img_pgd

        for i in range(3):
            shown_img = img if i==0 else noise_pgd if i==1 else img_pgd
            shown_label = 'Original image: {}'.format(true_label) if i==0 else 'Noise' if i==1 else 'Adversarial image: {}'.format(adv_label)
            plt.subplot(1,3,i+1)
            plt.imshow(shown_img)
            plt.xlabel(shown_label, fontsize= 12)
            plt.xticks([])
            plt.yticks([])
            plt.grid(False)
    return correct / 20 if correct > 0 else 0
attack_pgd()
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 27ms/step
C:\Users\manut\anaconda3\envs\gpu\lib\site-packages\tensorflow\python\data\ops\structured_function.py:264: UserWarning: Even though the `tf.config.experimental_run_functions_eagerly` option is set, this option does not apply to tf.data functions. To force eager execution of tf.data functions, please use `tf.data.experimental.enable_debug_mode()`.
  warnings.warn(
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 33ms/step
1/1 [==============================] - 0s 32ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 31ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Out[202]:
0.15
In [189]:
def attack_fgsm ():
    random_indices = np.random.choice(len(data_manager.X_test), size=20, replace=False)
    correct = 0
    for i in random_indices:
        x = np.expand_dims(data_manager.X_test[i], axis=0)
        x = tf.cast(x, dtype=tf.float32)

        x_pgd= fgsm_attack(loaded_model, x, data_manager.y_test[i])
        preds = loaded_model.predict(x)
        pgd_pred = loaded_model.predict(x_pgd)
        true_label= data_manager.classes[np.argmax(preds)]
        adv_label= data_manager.classes[np.argmax(pgd_pred)]
        if true_label == adv_label:
            correct += 1
        img = data_manager.X_test[i]
        img_pgd = np.squeeze(x_pgd.numpy())
        noise_pgd = np.clip(np.abs(img_pgd - img)*20, 0, 255).astype('int') # we multiply the noise by 20 for visualization
        fig = plt.figure(figsize=(15, 15*3))
        shown_img = img_pgd

        for i in range(3):
            shown_img = img if i==0 else noise_pgd if i==1 else img_pgd
            shown_label = 'Original image: {}'.format(true_label) if i==0 else 'Noise' if i==1 else 'Adversarial image: {}'.format(adv_label)
            plt.subplot(1,3,i+1)
            plt.imshow(shown_img)
            plt.xlabel(shown_label, fontsize= 12)
            plt.xticks([])
            plt.yticks([])
            plt.grid(False)
    return correct / 20 if correct > 0 else 0
attack_fgsm()
1/1 [==============================] - 0s 36ms/step
1/1 [==============================] - 0s 34ms/step
C:\Users\manut\anaconda3\envs\gpu\lib\site-packages\keras\backend.py:5582: UserWarning: "`sparse_categorical_crossentropy` received `from_logits=True`, but the `output` argument was produced by a Softmax activation and thus does not represent logits. Was this intended?
  output, from_logits = _get_logits(
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 28ms/step
1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 32ms/step
1/1 [==============================] - 0s 32ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 29ms/step
1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 27ms/step
1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 31ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 31ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 27ms/step
1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 32ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 34ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 35ms/step
1/1 [==============================] - 0s 30ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 27ms/step
1/1 [==============================] - 0s 25ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 32ms/step
1/1 [==============================] - 0s 32ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Out[189]:
0.05
In [190]:
def attack_mim ():
    random_indices = np.random.choice(len(data_manager.X_test), size=20, replace=False)
    correct = 0
    for i in random_indices:
        x = np.expand_dims(data_manager.X_test[i], axis=0)
        x = tf.cast(x, dtype=tf.float32)

        x_pgd= fgsm_attack(loaded_model, x, data_manager.y_test[i])
        preds = loaded_model.predict(x)
        pgd_pred = loaded_model.predict(x_pgd)
        true_label= data_manager.classes[np.argmax(preds)]
        adv_label= data_manager.classes[np.argmax(pgd_pred)]
        if true_label == adv_label:
            correct += 1
        img = data_manager.X_test[i]
        img_pgd = np.squeeze(x_pgd.numpy())
        noise_pgd = np.clip(np.abs(img_pgd - img)*20, 0, 255).astype('int') # we multiply the noise by 20 for visualization
        fig = plt.figure(figsize=(15, 15*3))
        shown_img = img_pgd

        for i in range(3):
            shown_img = img if i==0 else noise_pgd if i==1 else img_pgd
            shown_label = 'Original image: {}'.format(true_label) if i==0 else 'Noise' if i==1 else 'Adversarial image: {}'.format(adv_label)
            plt.subplot(1,3,i+1)
            plt.imshow(shown_img)
            plt.xlabel(shown_label, fontsize= 12)
            plt.xticks([])
            plt.yticks([])
            plt.grid(False)
    return correct / 20 if correct > 0 else 0
attack_mim()
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 30ms/step
1/1 [==============================] - 0s 33ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 27ms/step
1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 28ms/step
1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 28ms/step
1/1 [==============================] - 0s 29ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 30ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 30ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 30ms/step
1/1 [==============================] - 0s 31ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Out[190]:
0.25

**Question 3.7** Train a robust model using adversarial training with PGD ${\epsilon= 0.0313, k=10, \eta= 0.002}$. Write the code for the adversarial training and report the robust accuracies. After finishing the training, you need to store your best robust model in the folder ./models and load the model to evaluate the robust accuracies for PGD, MIM, and FGSM attacks with $\epsilon= 0.0313, k=20, \eta= 0.002$ on the testing set.

[4 points]

Accuracy before training

  • FGSM Attack: 5%
  • PGD Attack: 15%
  • MIM Attack: 25%

Accuracy after training and attacking with step size 20

  • FGSM Attack: 31%
  • PGD Attack: 24%
  • MIM Attack: 27%

As we can see, the adversarial training here has successfully reinforced our model against attacks and it is now able to give higher percentage of correct results after just 5 epochs.

In [191]:
from sklearn.metrics import accuracy_score
lenet_defence = loaded_model

#Insert your code here. You can add more cells if necessary
optimizer = tf.optimizers.Adam(learning_rate=0.001)
loss_obj = tf.nn.sparse_softmax_cross_entropy_with_logits

# metrics to track the different accuracies.
train_loss = tf.metrics.Mean(name='train_loss')
test_acc_clean = tf.metrics.SparseCategoricalAccuracy()
test_acc_pgd = tf.metrics.SparseCategoricalAccuracy()
batch_size = 32

def train_step_adv(x, x_adv, y):
    with tf.GradientTape() as tape:
        logits = lenet_defence(x)
        logits_adv = lenet_defence(x_adv)
        loss = (loss_obj(y, logits) + loss_obj(y, logits_adv))/2
        gradients = tape.gradient(loss, lenet_defence.trainable_variables)
        optimizer.apply_gradients(zip(gradients, lenet_defence.trainable_variables))
    return loss

epochs = 5 # number of epochs
for epoch in range(epochs):
    print ("\n")
    print ("epoch: ",epoch)
    # keras like display of progress
    progress_bar_train = tf.keras.utils.Progbar(data_manager.X_train.shape[0], verbose=1)
    for i in range (int (data_manager.X_train.shape[0]/ batch_size)):
        current_data = data_manager.next_batch(batch_size)
        (x,y) = current_data
        x = tf.cast(x, dtype=tf.float32)
        # train based on pgd attack according to the perimeters given 
        x_adv = pgd_attack(lenet_defence, x, y,  epsilon=0.0313,
              num_steps=10,
              step_size=0.002,
              clip_value_min=0.,
              clip_value_max=1.0,
              soft_label=False,
              from_logits= False)
        loss = train_step_adv(x, x_adv, y)
        y_pred = lenet_defence(x)
        test_acc_clean(y, y_pred)
        test_acc_pgd(y, lenet_defence(x_adv))
        train_loss(loss)
        progress_bar_train.add(x.shape[0], values=[('loss', train_loss.result()), ("acc (%)", test_acc_clean.result() * 100),("pgd (%)", test_acc_pgd.result() * 100)])
    print()
        

epoch:  0
7552/7560 [============================>.] - ETA: 0s - loss: 2.6384 - acc (%): 80.8675 - pgd (%): 17.7838


epoch:  1
7552/7560 [============================>.] - ETA: 0s - loss: 2.6217 - acc (%): 78.2272 - pgd (%): 22.1566


epoch:  2
7552/7560 [============================>.] - ETA: 0s - loss: 2.6073 - acc (%): 78.3158 - pgd (%): 24.6540


epoch:  3
7552/7560 [============================>.] - ETA: 0s - loss: 2.5977 - acc (%): 78.5100 - pgd (%): 26.2517


epoch:  4
7552/7560 [============================>.] - ETA: 0s - loss: 2.5896 - acc (%): 78.7079 - pgd (%): 27.5303
In [197]:
y_batch_adv = []
y_adv = []
y_true = []

for i in range(data_manager.X_test.shape[0] // batch_size):
    idx = data_manager.random.choice(data_manager.X_test.shape[0], batch_size, 
                                 replace=batch_size > data_manager.X_test.shape[0])
    x,y = data_manager.X_test[idx], data_manager.y_test[idx] 
    x_fgsm = fgsm_attack(lenet_defence, tf.cast(x, tf.float32), y, epsilon = 0.0313,
                           soft_label=False, clip_value_min=0.0, clip_value_max=255.0, from_logits=False)
    
    y_batch_adv = np.argmax(loaded_model(x_fgsm).numpy(), 1)
    y_adv.append(y_batch_adv[0].tolist())
    y_true.append(y[0].tolist())
    
    test_adv_acc = accuracy_score(y_true, y_adv)
print("FGSM attack accuracy:{}".format(test_adv_acc))
FGSM attack accuracy:0.3103448275862069
In [201]:
y_batch_adv = []
y_adv = []
y_true = []

for i in range(data_manager.X_test.shape[0] // batch_size):
    idx = data_manager.random.choice(data_manager.X_test.shape[0], batch_size, 
                                 replace=batch_size > data_manager.X_test.shape[0])
    x,y = data_manager.X_test[idx], data_manager.y_test[idx] 
    x_pgd = pgd_attack(lenet_defence, tf.cast(x, tf.float32), y, num_steps=20, step_size= 0.002, epsilon = 0.0313,
                   soft_label=False, clip_value_min=0.0, clip_value_max=255.0, from_logits=False)
    
    y_batch_adv = np.argmax(loaded_model(x_pgd).numpy(), 1)
    y_adv.append(y_batch_adv[0].tolist())
    y_true.append(y[0].tolist())
    
    test_adv_acc = accuracy_score(y_true, y_adv)
print("PGD attack accuracy:{}".format(test_adv_acc))
PGD attack accuracy:0.2413793103448276
In [199]:
y_batch_adv = []
y_adv = []
y_true = []

for i in range(data_manager.X_test.shape[0] // batch_size):
    idx = data_manager.random.choice(data_manager.X_test.shape[0], batch_size, 
                                 replace=batch_size > data_manager.X_test.shape[0])
    x,y = data_manager.X_test[idx], data_manager.y_test[idx] 
    x_mim = mim_attack(lenet_defence, tf.cast(x, tf.float32), y, num_steps=20, step_size= 0.002, epsilon = 0.0313,
                   soft_label=False, clip_value_min=0.0, clip_value_max=255.0, from_logits=False)
    
    y_batch_adv = np.argmax(loaded_model(x_mim).numpy(), 1)
    y_adv.append(y_batch_adv[0].tolist())
    y_true.append(y[0].tolist())
    
    test_adv_acc = accuracy_score(y_true, y_adv)
print("MIM attack accuracy:{}".format(test_adv_acc))
MIM attack accuracy:0.27586206896551724
In [ ]:
 

**Question 3.8 (Kaggle competition)**

[10 points] </div10

You can reuse the best model obtained in this assignment or develop new models to evaluate on the testing set of the FIT5215 Kaggle competion. However, to gain any points for this question, your testing accuracy must exceed the accuracy threshold from a base model developed by us as shown in the leader board of the competition.

The marks for this question are as follows:

  • If you are in top 10%, you gain 10 points.
  • If you are in top 20%, you gain 8 points.
  • If you are in top 30%, you gain 6 points.
  • If you win our base model, you gain 4 points.

**Tips and requirements**

  • Your team name or member name using in this Kaggle competion must contain your student ID, which faciliates us in marking this question.
  • You can use any deep/machine techniques in this Kaggle competition.
  • We apply some slight transformations and add noises to unseen testing images to make the task more challenging. There is a slight shift between training distribution and testing distribution.
  • You must submit your code, trained model, and a brief document decribed your method followed a provided template.

END OF ASSIGNMENT
GOOD LUCK WITH YOUR ASSIGNMENT 1!